Spark version: 2.2.0

Message in the log:

WARN RowBasedKeyValueBatch: Calling spill() on RowBasedKeyValueBatch. Will not spill but return 0.

What could be the reason for this warning? Is this something I should care about or can I safely ignore it?

Answer 1

As indicated here, this warning means that the memory available to the task is running low and that Spark is asking its memory consumers to move part of their in-memory contents to disk.

See also the Spark FAQ:

Does my data need to fit in memory to use Spark?

No. Spark's operators spill data to disk if it does not fit in memory, allowing it to run well on any sized data. Likewise, cached datasets that do not fit in memory are either spilled to disk or recomputed on the fly when needed, as determined by the RDD's storage level.

Answer 2

I think this message is more serious than a simple warning: it is on the verge of being an error.

Have a look at the source code:

/**
 * Sometimes the TaskMemoryManager may call spill() on its associated MemoryConsumers to make
 * space for new consumers. For RowBasedKeyValueBatch, we do not actually spill and return 0.
 * We should not throw OutOfMemory exception here because other associated consumers might spill
 */
public final long spill(long size, MemoryConsumer trigger) throws IOException {
    logger.warn("Calling spill() on RowBasedKeyValueBatch. Will not spill but return 0.");
    return 0;
}

It is taken from here: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/RowBasedKeyValueBatch.java

So I would say that you are in an infinite loop of "needing to spill but not actually spilling".
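
To make the comment in the quoted source easier to follow, here is a minimal, self-contained sketch of the idea it describes: a memory manager asks each registered consumer to spill, one consumer always returns 0, and the request can still succeed if another consumer frees memory. All names here (SpillSketch, Consumer, NonSpillingBatch, SpillableBuffer, Manager, makeRoom) are hypothetical and only model the behavior. They are not Spark's TaskMemoryManager or MemoryConsumer, whose real logic is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the spill protocol; not Spark's real classes.
final class SpillSketch {

    /** Anything that holds memory and can be asked to release some of it. */
    interface Consumer {
        long used();
        long spill(long size); // returns the number of bytes actually freed
    }

    /** Mimics the quoted method: logs a warning and never frees anything. */
    static final class NonSpillingBatch implements Consumer {
        private final long used;
        NonSpillingBatch(long used) { this.used = used; }
        public long used() { return used; }
        public long spill(long size) {
            System.out.println("WARN: spill() called on batch. Will not spill but return 0.");
            return 0;
        }
    }

    /** A consumer that can release memory (standing in for a write to disk). */
    static final class SpillableBuffer implements Consumer {
        private long used;
        SpillableBuffer(long used) { this.used = used; }
        public long used() { return used; }
        public long spill(long size) {
            long freed = Math.min(size, used);
            used -= freed;
            return freed;
        }
    }

    /** Toy memory manager with a fixed capacity in bytes. */
    static final class Manager {
        private final long capacity;
        private final List<Consumer> consumers = new ArrayList<>();
        Manager(long capacity) { this.capacity = capacity; }
        void register(Consumer c) { consumers.add(c); }

        /** Bytes not currently held by any registered consumer. */
        long free() {
            long used = 0;
            for (Consumer c : consumers) {
                used += c.used();
            }
            return capacity - used;
        }

        /** Asks consumers to spill until `needed` bytes are free; reports whether that worked. */
        boolean makeRoom(long needed) {
            for (Consumer c : consumers) {
                long missing = needed - free();
                if (missing <= 0) {
                    break;
                }
                c.spill(missing);
            }
            return free() >= needed;
        }
    }

    public static void main(String[] args) {
        Manager mixed = new Manager(100);
        mixed.register(new NonSpillingBatch(40));
        mixed.register(new SpillableBuffer(50));
        System.out.println("room made: " + mixed.makeRoom(50));
    }
}
```

In this model the warning on its own is harmless as long as some other consumer can free the missing bytes. The request fails only when every consumer behaves like the non-spilling batch. Whether a real job is in that situation depends on the workload and cannot be read off the warning alone.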

Source: https://stackoverflow.com/questions/46907447/meaning-of-apache-spark-warning-calling-spill-on-rowbasedkeyvaluebatch

Quoted from: https://www.e-learn.cn/topic/3560880
