需求:统计每日销售额


package wujiadong_sparkSQL import org.apache.spark.sql.types._
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.functions._ /**
* Created by Administrator on 2017/3/6.
*/
object DailySale {
def main(args: Array[String]): Unit = {
val conf = new SparkConf().setAppName("dailysale").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
//模拟数据
val userSalelog = Array(
"2017-02-01,55,1122",
"2017-02-01,23,1133",
"2017-02-01,15,",
"2017-02-02,56,1155",
"2017-02-01,78,1123",
"2017-02-01,113,1144"
) val userSalelogRDD = sc.parallelize(userSalelog,2)
val filteredUserRDD = userSalelogRDD.filter(log => if(log.split(",").length == 3) true else false)
val RowRDD = filteredUserRDD.map(log => Row(log.split(",")(0),log.split(",")(1).toInt,log.split(",")(2).toInt))
val schema = StructType(
Array(
StructField("date",StringType,true),
StructField("sale_amount",IntegerType,true),
StructField("userid",IntegerType,true)
)
) val df = sqlContext.createDataFrame(RowRDD,schema) df.groupBy("date")
.agg('date,sum('sale_amount))
.map(row => Row(Row(row(0),row(2))))
.collect()
.foreach(println) } }

运行结果


hadoop@master:~/wujiadong$ spark-submit --class wujiadong_sparkSQL.DailySale --executor-memory 500m --total-executor-cores 2 /home/hadoop/wujiadong/wujiadong.spark.jar
17/03/06 20:55:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/06 20:55:21 WARN SparkConf:
SPARK_CLASSPATH was detected (set to ':/home/hadoop/bigdata/hive/lib/mysql-connector-java-5.1.26-bin.jar').
This is deprecated in Spark 1.0+. Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath 17/03/06 20:55:21 WARN SparkConf: Setting 'spark.executor.extraClassPath' to ':/home/hadoop/bigdata/hive/lib/mysql-connector-java-5.1.26-bin.jar' as a work-around.
17/03/06 20:55:21 WARN SparkConf: Setting 'spark.driver.extraClassPath' to ':/home/hadoop/bigdata/hive/lib/mysql-connector-java-5.1.26-bin.jar' as a work-around.
17/03/06 20:55:23 INFO Slf4jLogger: Slf4jLogger started
17/03/06 20:55:23 INFO Remoting: Starting remoting
17/03/06 20:55:24 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.131:58765]
17/03/06 20:55:25 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/03/06 20:55:26 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
[[2017-02-01,269]]
[[2017-02-02,56]]
17/03/06 20:55:51 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/03/06 20:55:51 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.

最新文章

  1. Asp.Net Mvc + ComBoost.Mvc快速开发
  2. 【C++ STL编程】queue小例子
  3. 越狱后想禁用Spotlight
  4. 【转】利用 Bootstrap 进行快速 Web 开发
  5. node 裁剪图片
  6. Linux下设置允许myql数据库远程连接
  7. IDF实验室解题学习笔记1
  8. iOS 关于时间戳的一些细节
  9. Largest Rectangle in Histogram 解答
  10. Log4Net五部曲
  11. 【java 多线程】多线程并发同步问题及解决方法
  12. Tensorflow timeline trace
  13. 主元素问题 Majority Element
  14. maven项目里的mapper不被加载,解析
  15. bzoj千题计划125:bzoj1037: [ZJOI2008]生日聚会Party
  16. Java精选笔记_面向对象(构造方法、this关键字、static关键字、内部类)
  17. linux下升级npm以及node
  18. android Studio 运行不显示avd 无法运行
  19. 〖Linux〗使用root权限,telnet登录开发板
  20. 2017年上海金马五校程序设计竞赛:Problem B : Sailing (广搜)

热门文章

  1. Zipline Data Bundles
  2. Django REST framework 理解
  3. Service Receiver Activity 之间的通信
  4. 0602-Zuul构建API Gateway-Zuul Http Client、cookie、header
  5. 19.如何在vue里面调用其他js
  6. 把typora改为微软雅黑+Consolas
  7. MySQL整理(三)
  8. APP移动端自动化测试工具选型“兵器谱”一览(主流开源工具)
  9. JAVA集合详解(Collection和Map接口)
  10. Windows server 2003 伪静态配置方法