spark查看DF的partition数目及每个partition中的数据量【集群模式】
2024-08-21 16:01:33
println("--------------------"+data.rdd.getNumPartitions) // 获取DF中partition的数目
val partitions = data.rdd.glom().collect() // 获取所有data下所有的partition,返回一个partition的集合
for(part <- partitions){
println(part.getClass.getName + "::::::::" + part.length) // 每个partition中的数据量
}
结果:
--------------------100
[Lorg.apache.spark.sql.Row;::::::::61516
[Lorg.apache.spark.sql.Row;::::::::61656
[Lorg.apache.spark.sql.Row;::::::::61991
[Lorg.apache.spark.sql.Row;::::::::61269
[Lorg.apache.spark.sql.Row;::::::::61654
[Lorg.apache.spark.sql.Row;::::::::61780
[Lorg.apache.spark.sql.Row;::::::::62059
[Lorg.apache.spark.sql.Row;::::::::61675
[Lorg.apache.spark.sql.Row;::::::::61339
[Lorg.apache.spark.sql.Row;::::::::61783
[Lorg.apache.spark.sql.Row;::::::::61620
[Lorg.apache.spark.sql.Row;::::::::61883
[Lorg.apache.spark.sql.Row;::::::::61631
[Lorg.apache.spark.sql.Row;::::::::61930
[Lorg.apache.spark.sql.Row;::::::::61451
[Lorg.apache.spark.sql.Row;::::::::61797
[Lorg.apache.spark.sql.Row;::::::::61367
[Lorg.apache.spark.sql.Row;::::::::61647
[Lorg.apache.spark.sql.Row;::::::::61488
[Lorg.apache.spark.sql.Row;::::::::61584
[Lorg.apache.spark.sql.Row;::::::::61733
[Lorg.apache.spark.sql.Row;::::::::61491
[Lorg.apache.spark.sql.Row;::::::::61809
[Lorg.apache.spark.sql.Row;::::::::61062
[Lorg.apache.spark.sql.Row;::::::::61658
[Lorg.apache.spark.sql.Row;::::::::61599
[Lorg.apache.spark.sql.Row;::::::::61911
[Lorg.apache.spark.sql.Row;::::::::61602
[Lorg.apache.spark.sql.Row;::::::::61348
[Lorg.apache.spark.sql.Row;::::::::61677
[Lorg.apache.spark.sql.Row;::::::::61722
[Lorg.apache.spark.sql.Row;::::::::61482
[Lorg.apache.spark.sql.Row;::::::::61714
[Lorg.apache.spark.sql.Row;::::::::61241
[Lorg.apache.spark.sql.Row;::::::::61737
[Lorg.apache.spark.sql.Row;::::::::62015
[Lorg.apache.spark.sql.Row;::::::::62062
[Lorg.apache.spark.sql.Row;::::::::61557
[Lorg.apache.spark.sql.Row;::::::::61607
[Lorg.apache.spark.sql.Row;::::::::61175
[Lorg.apache.spark.sql.Row;::::::::61653
[Lorg.apache.spark.sql.Row;::::::::61460
[Lorg.apache.spark.sql.Row;::::::::61705
[Lorg.apache.spark.sql.Row;::::::::61492
[Lorg.apache.spark.sql.Row;::::::::61340
[Lorg.apache.spark.sql.Row;::::::::61767
[Lorg.apache.spark.sql.Row;::::::::61756
[Lorg.apache.spark.sql.Row;::::::::61793
[Lorg.apache.spark.sql.Row;::::::::61417
[Lorg.apache.spark.sql.Row;::::::::61376
[Lorg.apache.spark.sql.Row;::::::::62039
[Lorg.apache.spark.sql.Row;::::::::61571
[Lorg.apache.spark.sql.Row;::::::::61849
[Lorg.apache.spark.sql.Row;::::::::61553
[Lorg.apache.spark.sql.Row;::::::::61612
[Lorg.apache.spark.sql.Row;::::::::61980
[Lorg.apache.spark.sql.Row;::::::::61714
[Lorg.apache.spark.sql.Row;::::::::62376
[Lorg.apache.spark.sql.Row;::::::::61884
[Lorg.apache.spark.sql.Row;::::::::61273
[Lorg.apache.spark.sql.Row;::::::::61669
[Lorg.apache.spark.sql.Row;::::::::61695
[Lorg.apache.spark.sql.Row;::::::::61515
[Lorg.apache.spark.sql.Row;::::::::61247
[Lorg.apache.spark.sql.Row;::::::::61909
[Lorg.apache.spark.sql.Row;::::::::61879
[Lorg.apache.spark.sql.Row;::::::::61913
[Lorg.apache.spark.sql.Row;::::::::61199
[Lorg.apache.spark.sql.Row;::::::::61678
[Lorg.apache.spark.sql.Row;::::::::61619
[Lorg.apache.spark.sql.Row;::::::::61909
[Lorg.apache.spark.sql.Row;::::::::61406
[Lorg.apache.spark.sql.Row;::::::::61775
[Lorg.apache.spark.sql.Row;::::::::61559
[Lorg.apache.spark.sql.Row;::::::::61773
[Lorg.apache.spark.sql.Row;::::::::61888
[Lorg.apache.spark.sql.Row;::::::::61634
[Lorg.apache.spark.sql.Row;::::::::61786
[Lorg.apache.spark.sql.Row;::::::::61666
[Lorg.apache.spark.sql.Row;::::::::61519
[Lorg.apache.spark.sql.Row;::::::::61563
[Lorg.apache.spark.sql.Row;::::::::61481
[Lorg.apache.spark.sql.Row;::::::::61295
[Lorg.apache.spark.sql.Row;::::::::61343
[Lorg.apache.spark.sql.Row;::::::::61750
[Lorg.apache.spark.sql.Row;::::::::61328
[Lorg.apache.spark.sql.Row;::::::::61650
[Lorg.apache.spark.sql.Row;::::::::61541
[Lorg.apache.spark.sql.Row;::::::::61397
[Lorg.apache.spark.sql.Row;::::::::61505
[Lorg.apache.spark.sql.Row;::::::::61761
[Lorg.apache.spark.sql.Row;::::::::61795
[Lorg.apache.spark.sql.Row;::::::::62291
[Lorg.apache.spark.sql.Row;::::::::61566
[Lorg.apache.spark.sql.Row;::::::::61213
[Lorg.apache.spark.sql.Row;::::::::62028
[Lorg.apache.spark.sql.Row;::::::::62634
[Lorg.apache.spark.sql.Row;::::::::61838
[Lorg.apache.spark.sql.Row;::::::::61243
[Lorg.apache.spark.sql.Row;::::::::61585
样例:
--------------------100
[Lorg.apache.spark.sql.Row;::::::::61516
[Lorg.apache.spark.sql.Row;::::::::61656
[Lorg.apache.spark.sql.Row;::::::::61991
[Lorg.apache.spark.sql.Row;::::::::61269
[Lorg.apache.spark.sql.Row;::::::::61654
[Lorg.apache.spark.sql.Row;::::::::61780
最新文章
- plsql查找不到带中文的纪录
- [ZZ] Equal Error Rate (EER)
- 数组的join()函数操作
- Python第一个入门程序
- 如何便携使用github
- Robotium -- 针对apk包的测试
- 自定义一个searchBar
- Android学习笔记之View(二)
- HTTP协议中返回代码302的情况
- hibernate、easyui、struts2整合
- Debian7安装php5.5/5.6
- 内存管理单元--MMU
- NMS和soft-nms算法
- 51nod1185 威佐夫游戏 V2 (模拟乘法)
- Router components
- elasticSearch6源码分析(12)DiscoveryModule
- 817E. Choosing The Commander trie字典树
- UVALive 6044(双连通分量的应用)
- The promises and challenges of std::async task-based parallelism in C++11 C++11 std::async/future/promise
- java加密用到了BASE64Decoder时报错信息:Access restriction: The type BASE64Encoder is not accessible due to restrict