I have worked on many applications during my career.  Many, if not all, had some search capability.  The more complex the search got, the harder it was to control its performance and its impact on database transactions.  If you also want to support full text search, the problem gets bigger.  We could use the full text search capabilities in Oracle or SQL Server, but we would need to set up a separate instance if we wanted to limit the impact on transactions.  Or we can pick another solution, better suited to the problem at hand.  This is where Elastic Search comes in.

It is a web services layer set up on top of Lucene, a search engine library written in Java.  As a result, we need the Java Runtime installed on every machine where we are going to install Elastic Search, including developer machines.  You can download it from the Oracle web site, and it is free: http://www.oracle.com/technetwork/java/javase/downloads/jre8-downloads-2133155.html  I would recommend making sure you have the 64-bit version installed.

You can download Elastic Search at https://www.elastic.co/downloads/elasticsearch.  Let’s start by doing so.  Once you have the zip file, we will install the Windows service that hosts Elastic Search.  This is the easiest way to run Elastic on Windows.  Once you unzip the download, switch to the bin folder in a command window.  Then run service.bat to install the Windows service.  Just type the following:

service install

If you get the message “JAVA_HOME environment variable must be set”, we need to set up this environment variable.  Go to the properties of “My Computer”.  Then select advanced system settings, then environment variables.  Add a new system variable called JAVA_HOME.  Its value should be something similar to “C:\Program Files\Java\jre1.8.0_74”.  Then run service install again.  You should see a confirmation that the service was installed.  If so, you are done with step 1.

The next step is to start a new .NET project.  I would pick a class library to house our search code.  We will test it using a test project.

So, start a new class library project.  I’ll call mine Search.Library.  When we integrate Elastic with .NET, we should use a .NET client library.  I use NEST, which is the best one in my opinion.  To install it, use NuGet.  I would switch to the NuGet Package Manager Console window and type

install-package NEST

or you can use the package manager window.  NEST will also install its dependencies, the Elasticsearch.Net package and JSON.NET.  Technically speaking, we could start testing at this point, but we need to take a few more steps first.

I have been using Elastic Search for a while.  In theory, you can use a schema-less approach with it.  In practice, however, this does not work very well.  A schema has many advantages.  We can be very precise, especially for nullable data, where Elastic does not really know how to index the data.  Without a schema, it would apply the default full text search approach to all the data.  This may or may not be what we want.

As an example, let’s define a class that we will use for samples.  Let’s call it Location, corresponding to a city.

namespace Search.Library
{
    public class Location
    {
        public int CityId { get; set; }
        public string City { get; set; }
        public string Zip { get; set; }
        public string Type { get; set; }
        public string State { get; set; }
        public string County { get; set; }
        public string AreaCodes { get; set; }
        public double Latitude { get; set; }
        public double Longitude { get; set; }
        public string WorldRegion { get; set; }
        public string Country { get; set; }
        public int EstimatedPopulation { get; set; }
        public Coordinates Coordinates { get; set; }
    }
}

Because we need to support geographic location, we defined a separate Coordinates type.

namespace Search.Library
{
    public class Coordinates
    {
        public double Lat { get; set; }
        public double Lon { get; set; }
    }
}

We need to configure the mappings in Elastic.  By default we may want free-form search on all fields, which Elastic Search refers to as analyzed fields.  Analyzed fields are broken into words, and the words are then indexed for speedy word-based search.  In the case of zip codes, though, say we want to search similarly to LIKE ‘%%’ in SQL Server.  Any such field we need to flag as “not analyzed”.  Searches on not-analyzed fields are case sensitive, so we may also want a custom analyzer to account for that.  We want to use the same approach for all the fields that we want to match exactly.  We should also think about primary keys.  In this case we want to flag CityId as the id field in Elastic Search.  I feel that thinking about your mappings and queries up front will save you some headaches down the road.

private void CreateMappings()
{
    _client.Map<Location>(descriptor =>
    {
        descriptor.Index(DefaultIndexName);
        descriptor.Properties(propertiesDescriptor =>
        {
            propertiesDescriptor.Number(loc => loc.Name(location => location.CityId));
            // Standard analyzed (word-based) string fields.
            propertiesDescriptor.String(loc => loc.Name(location => location.City));
            propertiesDescriptor.String(loc => loc.Name(location => location.Country));
            propertiesDescriptor.String(loc => loc.Name(location => location.State));
            propertiesDescriptor.String(loc => loc.Name(location => location.Type));
            // Zip is kept as a single, not-analyzed value.
            propertiesDescriptor.String(loc => loc.Name(location => location.Zip)
                .NotAnalyzed().Analyzer(LowerCaseAnalyzerName));
            propertiesDescriptor.Number(loc => loc.Name(location => location.Latitude));
            // The original mapped Latitude twice; the second call should be Longitude.
            propertiesDescriptor.Number(loc => loc.Name(location => location.Longitude));
            propertiesDescriptor.Number(loc => loc.Name(location => location.EstimatedPopulation));
            // Coordinates is a geo_point for spatial search.
            propertiesDescriptor.GeoPoint(loc =>
            {
                loc.Name(location => location.Coordinates);
                loc.LatLon();
                return loc;
            });
            return propertiesDescriptor;
        });
        return descriptor;
    });
}

In the mapping creation above, _client is an instance of ElasticClient.  We run through the properties of our type, Location, and set up each one.  In the case of Zip, we set it up for wildcard search.  The rest of the string properties are set up for standard word-based indexed search.  Finally, I set up Coordinates as type GeoPoint for spatial search.  We are going to run through the code in unit tests to make sure our mappings work OK.

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Nest;

namespace Search.Library.Tests
{
    [TestClass]
    public class SearchTests
    {
        private ElasticClient _client;

        [TestInitialize]
        public void OnInit()
        {
            _client = new ElasticConfiguration().CreatElasticClient();
            // just for testing. Should use custom index name.
            var indexExists = _client.IndexExists(
                new IndexExistsRequest(Indices.Parse(ElasticConfiguration.DefaultIndexName)));
            if (indexExists.Exists)
            {
                _client.DeleteIndex(
                    new DeleteIndexDescriptor(Indices.Parse(ElasticConfiguration.DefaultIndexName)));
            }
            _client.Refresh(new RefreshRequest(Indices.All));
            new ElasticConfiguration().SetupMappings();
        }

        [TestMethod]
        public void Should_Create_Mappings()
        {
            var config = new ElasticConfiguration();
            config.SetupMappings();
        }

        [TestMethod]
        public void Should_Add_Data()
        {
            _client = new ElasticConfiguration().CreatElasticClient();
            var loc = new Location
            {
                Type = "STANDARD",
                Coordinates = new Coordinates { Lat = 30, Lon = 40 },
                Latitude = 30,
                CityId = 1,
                EstimatedPopulation = 23,
                State = "GA",
                City = "Atlanta",
                Zip = "30000",
                Country = "USA",
                AreaCodes = "33333 44444",
                County = "Gwinnett",
                Longitude = 40,
                WorldRegion = "North America"
            };
            _client.Index(loc, descriptor =>
            {
                descriptor.Index("default");
                return descriptor;
            });
        }
    }
}
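The tests above rely on an ElasticConfiguration class that is not listed in this post.  Below is a minimal sketch of what such a class might look like with NEST 2.x; it is not the actual class from the project.  The names DefaultIndexName and CreatElasticClient and the values "default" and "customLowerCase" come from the code and mappings in this post.  Everything else is an assumption for illustration: the class name ElasticConfigurationSketch, the node address http://localhost:9200, the CreateIndexWithAnalyzer method, and the choice of a keyword tokenizer with a lowercase filter for the custom analyzer.

```csharp
using System;
using Nest;

namespace Search.Library
{
    // Hypothetical stand-in for the ElasticConfiguration class used by the tests.
    public class ElasticConfigurationSketch
    {
        // Index and analyzer names as they appear in this post.
        public const string DefaultIndexName = "default";
        public const string LowerCaseAnalyzerName = "customLowerCase";

        // Assumption: a single local node listening on the default port.
        public static readonly Uri NodeUri = new Uri("http://localhost:9200");

        // Builds a client that talks to the local node and uses the
        // default index when a request does not name one.
        public ElasticClient CreatElasticClient()
        {
            var settings = new ConnectionSettings(NodeUri)
                .DefaultIndex(DefaultIndexName);
            return new ElasticClient(settings);
        }

        // Creates the index and registers a custom analyzer under the name
        // the Zip mapping refers to. Assumption: the analyzer keeps the whole
        // value as one token (keyword tokenizer) and lower-cases it.
        public bool CreateIndexWithAnalyzer(ElasticClient client)
        {
            var response = client.CreateIndex(DefaultIndexName, index => index
                .Settings(settings => settings
                    .Analysis(analysis => analysis
                        .Analyzers(analyzers => analyzers
                            .Custom(LowerCaseAnalyzerName, custom => custom
                                .Tokenizer("keyword")
                                .Filters("lowercase"))))));

            // IsValid is false if the node rejected the request or could not be reached.
            return response.IsValid;
        }
    }
}
```

In a setup like this, the index would be created with its analyzer first, and the mappings shown earlier would be applied right after, since a mapping can only refer to an analyzer that the index already knows about.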

If we want to look at our mappings, we can easily do this in Chrome.  Go to extensions, search for “Sense”, and install it.  This adds an Elastic Search plugin to Chrome.  Click on the plugin after that, and the Sense console will open.

To look at the mappings, just type GET _mapping and hit the green arrow.

Our mappings look as follows.

{
  "default": {
    "mappings": {
      "locations": {
        "properties": {
          "areaCodes": { "type": "string" },
          "city": { "type": "string" },
          "cityId": { "type": "double" },
          "coordinates": { "type": "geo_point", "lat_lon": true },
          "country": { "type": "string" },
          "county": { "type": "string" },
          "estimatedPopulation": { "type": "double" },
          "latitude": { "type": "double" },
          "longitude": { "type": "double" },
          "state": { "type": "string" },
          "type": { "type": "string" },
          "worldRegion": { "type": "string" },
          "zip": {
            "type": "string",
            "index": "not_analyzed",
            "analyzer": "customLowerCase"
          }
        }
      }
    }
  }
}
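In these mappings zip is not_analyzed, while city is an ordinary analyzed string.  To make the difference concrete, here is a plain C# sketch that imitates how each kind of field is turned into index terms.  It is only an illustration and not how Elastic or Lucene is implemented; the IndexingSketch class, its method names, and the simple split-on-punctuation rule are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Search.Library.Sketches
{
    // Illustration only: a toy model of analyzed vs. not-analyzed fields.
    public static class IndexingSketch
    {
        private static readonly char[] Separators = { ' ', ',', '.', ';', '-' };

        // Analyzed field: the value is split into words and each word is
        // lower-cased, so every word becomes its own searchable term.
        public static List<string> AnalyzedTerms(string value)
        {
            return value
                .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => word.ToLowerInvariant())
                .ToList();
        }

        // Not-analyzed field: the whole value is stored as one term, unchanged.
        public static List<string> NotAnalyzedTerms(string value)
        {
            return new List<string> { value };
        }

        // Exact, case-sensitive lookup of a single term.
        public static bool TermMatches(IEnumerable<string> terms, string query)
        {
            return terms.Contains(query, StringComparer.Ordinal);
        }

        // Wildcard-style lookup: true if any term contains the fragment.
        public static bool ContainsMatches(IEnumerable<string> terms, string fragment)
        {
            return terms.Any(term => term.IndexOf(fragment, StringComparison.Ordinal) >= 0);
        }
    }
}
```

With this sketch, AnalyzedTerms("Stone Mountain") yields the terms stone and mountain, so a search for mountain finds the city but a search for the full original string does not.  NotAnalyzedTerms("30000") keeps a single term that a fragment such as 000 can be matched against.  A not-analyzed value such as GA would not match ga, which is the case sensitivity that a lower-casing analyzer is meant to address.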

We will discuss queries in subsequent posts.  You can download the current project here.
