1、首先将GEOIP放到服务器上,如,/opt/db/geo/GeoLite2-City.mmdb

2、新建scala sbt工程,测试是否可以顺利解析

import java.io.File
import java.net.InetAddress
import com.maxmind.db.CHMCache
import com.maxmind.geoip2.DatabaseReader
import org.json4s.DefaultFormats /**
* Created by zxh on 2016/7/17.
*/
object test {
implicit val formats = DefaultFormats def main(args: Array[String]): Unit = {
val url = "F:\\Code\\OpenSource\\Data\\spark-sbt\\src\\main\\resources\\GeoLite2-City.mmdb"
// val url2 = "/opt/db/geo/GeoLite2-City.mmdb"
val geoDB = new File(url);
geoDB.exists()
val geoIPResolver = new DatabaseReader.Builder(geoDB).withCache(new CHMCache()).build();
val ip = "222.173.17.203"
val inetAddress = InetAddress.getByName(ip)
val geoResponse = geoIPResolver.city(inetAddress)
val (country, province, city) = (geoResponse.getCountry.getNames.get("zh-CN"), geoResponse.getSubdivisions.get(0).getNames().get("zh-CN"), geoResponse.getCity.getNames.get("zh-CN")) println(s"country:$country,province:$province,city:$city")
}
}
build.sbt 内容如下
import AssemblyKeys._
assemblySettings
mergeStrategy in assembly <<= (mergeStrategy in assembly) { mergeStrategy =>
{
case entry => {
val strategy = mergeStrategy(entry)
if (strategy == MergeStrategy.deduplicate) MergeStrategy.first
else strategy
}
}
}
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
name := "scala_sbt"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "com.maxmind.geoip2" % "geoip2" % "2.5.0"

  将该程序打包,放到服务器上,执行scala -cp ./scala_sbt-assembly-1.0.jar test,解析结果如下

country:中国,province:山东省,city:济南

3、编写streaming程序

import java.io.File
import java.net.InetAddress import com.maxmind.db.CHMCache
import com.maxmind.geoip2.DatabaseReader
import com.maxmind.geoip2.model.CityResponse
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Time, Seconds, StreamingContext}
import org.apache.spark.{SparkContext, SparkConf} /**
* Created by zxh on 2016/7/17.
*/
object geoip { def main(args: Array[String]): Unit = {
val conf = new SparkConf().setAppName("geoip_test").setMaster("local[2]")
val sc = new SparkContext(conf)
val ssc = new StreamingContext(sc, Seconds(10))
val lines = ssc.socketTextStream("localhost", 9999) lines.foreachRDD((rdd: RDD[String], t: Time) => {
rdd.foreachPartition(p => {
val url2 = "/opt/db/geo/GeoLite2-City.mmdb"
val geoDB = new File(url2);
val geoIPResolver = new DatabaseReader.Builder(geoDB).withCache(new CHMCache()).build(); def resolve_ip(resp: CityResponse): (String, String, String) = {
(resp.getCountry.getNames.get("zh-CN"), resp.getSubdivisions.get(0).getNames().get("zh-CN"), resp.getCity.getNames.get("zh-CN"))
} p.foreach(x => {
if (x != None && x != null && x != "") {
val inetAddress = InetAddress.getByName(x)
val geoResponse = geoIPResolver.city(inetAddress)
println(resolve_ip(geoResponse))
}
})
})
}) ssc.start
}
}
build.sbt libraryDependencies += "com.maxmind.geoip2" % "geoip2" % "2.5.0"

注意:红色部分需要放到foreachPartition内部,原因如下:

1、减少加载文件次数,一个Partition只加载一次

2、resolve_ip 函数参数为CityResponse,此参数不可序列化,所以要在Partition内部,这样就不会在节点之间序列化传输

3、com.maxmind.geoip2 版本需要是 2.5.0,以便和spark本身兼容,否则会报错如下:

val geoIPResolver = new DatabaseReader.Builder(geoDB).withCache(new CHMCache()).build();
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.node.ArrayNode.<init>(Lcom/fasterxml/jackson/databind/node/JsonNodeFactory;Ljava/util/List;)V

最新文章

  1. php打印中文乱码
  2. su root 和su - root 的区别
  3. 虚拟机Linux----Ubuntu1204----root登录设置
  4. MVC3中,在control里面三种Html代码输出形式
  5. 软件工程课程作业(三)--四则运算3(C++)
  6. iOS开发之Runloop(转)
  7. Cocoapods 64-bit(iPhone5s) 问题解决方案
  8. css3实现手机菜单展开收起动画
  9. hql中in的用法
  10. swift UI特殊培训38 与滚动码ScrollView
  11. 完美解决ie浏览器location.href不刷新页面的问题,进入页面只刷新一次
  12. Unity 读写文本 文件
  13. ionic BUILD FAILED
  14. restframework细节学习
  15. Java字符串占位符(commons-text)替换(转载)
  16. android手机抓wireshark包的步骤-tcpdump(需root权限)
  17. Linux(CentOS7.0)下 C访问MySQL (转)
  18. [转]OpenStack Neutron解析
  19. Jena 操作 RDF 文件
  20. eclipse中怎样添加项目至SVN资源库

热门文章

  1. 微信小程序 --- 获取网络状态
  2. Mybatis框架插件PageHelper的使用
  3. mysql数据库新插入数据,需要立即获取最新插入的id
  4. T-SQL创建作业
  5. Redis讲解
  6. 关于java web的笔记2018-01-12
  7. HI3518E用J-link烧写裸板fastboot u-boot流程
  8. CMDB内功心法,助我登上运维之巅
  9. Struts2的ActionContext
  10. java爬取网页内容 简单例子(2)——附jsoup的select用法详解