889-ElasticSearch入门
https://www.imooc.com/learn/889
简介:ElasticSearch是一个分布式、可扩展、实时的搜索与数据分析引擎,它能从项目一开始就赋予你的数据以搜索、分析和探索的能力。通过本课程的学习,你可以了解到,ElasticSearch在互联网行业里的火热程度,也可以了解到它的实际应用场景。本课程会通过理论与实践相结合的方式,带领你一步一步走进ElasticSearch的世界,使你轻松掌握ElasticSearch的基本概念,学习ElasticSearch的服务搭建,了解ElasticSearch的常用技巧,并通过案例项目让你拥有实际的应用能力。 老师实战课程已经上线:http://coding.imooc.com/class/167.html ElasticSearch+MySQL+Kafka强力组合,更有ES结合百度地图,Nginx等高级应用。
目录
第1章 课程介绍
介绍Elastic Search的由来,应用场景,课程需要的前置知识,环境要求,介绍课程安排
1-1 课程介绍
什么是ElasticSearch?
基于Apache Lucene构建的开源搜索引擎
Lucene相对复杂,需要深厚的搜索理论知识,难以集成
采用Java编写,提供简单易用的RESTFul API
轻松的横向扩展,可支持PB级的结构化或非结构化 数据处理
应用场景:
- 海量数据分析引擎
- 站内搜索引擎
- 数据仓库
一线公司实际应用场景:
- 英国卫报-实时分析公众对文章的回应
- 维基百科、Github-站内实时搜索
- 百度-实时日志监控平台
- 阿里巴巴、Google、京东等等
前置知识
- 熟悉用Maven构建项目
- 了解Spring Boot的基本使用
环境要求:
- IDE工具:Intellij IDEA、Eclipse等常用IDE即可
- Java:JDK1.8
- 其他依赖:Maven、NodeJS(6.0以上)
课程简介:
- 安装=》基础概念=》基本用法=》高级查询=》实战演练=》课程总结
第2章 安装
先从单个节点的安装讲起,然后把插件安装流程走一遍,并说一下插件的主要作用,最后把分布式安装简单的介绍一下
2-1 ES版本历史和选择
- 版本历史 1.x -> 2.x -> 5.x
- 版本选择 5.x以上
安装
- 单实例安装
- 实用插件Head安装
- 分布式安装
2-2 单实例安装
# 下载
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.3.1-linux-x86_64.tar.gz
# 解压
tar -zvxf elasticsearch-7.3.1-linux-x86_64.tar.gz
# 进入HOME目录
cd elasticsearch-7.3.1-linux-x86_64
# 运行ElasticSearch程序
sh ./bin/elasticsearch
# 获取ElasticSearch服务信息
curl http://localhost:9200/
2-3 插件安装
# 下载
wget https://github.com/mobz/elasticsearch-head/archive/master.zip
# 解压
unzip master.zip
# 进入HOME目录
cd elasticsearch-head-master
# 编译源码
npm install
# 运行程序
npm run start
# 访问服务地址
open http://localhost:9100/
修改ElasticSearch配置
vim config/elasticsearch.yml
http.cors.enabled: true http.cors.allow-origin: "*"
重启ElasticSearch、Head
2-4 分布式安装
修改ElasticSearch配置
主节点:
vim config/elasticsearch.yml # -import ~~master~~ # 查看进程 ps -ef | grep `pwd` # 后台启动ElasticSearch ./bin/elasticsearch -d
# ~~master~~ cluster.name: wali node.name: master node.master: true network.host: 127.0.0.1
从节点
# 创建子节点目录 mkdir es_slave # 拷贝ElasticSearch cp elasticsearch-7.3.1-linux-x86_64.tar.gz es_slave # 解压 tar -zvxf elasticsearch-7.3.1-linux-x86_64.tar.gz # 复制两份子节点程序 cp -r elasticsearch-7.3.1-linux-x86_64 es_slave1 cp -r elasticsearch-7.3.1-linux-x86_64 es_slave2 cd es_slave1 vi config/elasticsearch.yml # -import ~~es_slave1~~ # 后台启动ElasticSearch ./bin/elasticsearch -d cd es_slave2 vi config/elasticsearch.yml # -import ~~es_slave2~~ # 后台启动ElasticSearch ./bin/elasticsearch -d
# ~~es_slave1~~ cluster.name: wali node.name: slave1 network.host: 127.0.0.1 http.port: 8001
# ~~es_slave2~~ cluster.name: wali node.name: slave2 network.host: 127.0.0.1 http.port: 8002
第3章 基础概念
结合Head插件,简单的讲解集群、节点的概念,重点讲解索引及其重要性,分片和备份属于索引的附属概念,一笔带过,类型、文档简单讲解下,并将索引、类型、文档三者的关系讲解清楚
3-1 基础概念
集群和节点:
- 节点1+节点2+…+…+节点N ==》集群
基础概念
索引:含有相同属性的文档集合
类型:索引可以定义一个或多个类型,文档必须属于一个类型
文档:文档是可以被索引的基本数据单位
分片:每个索引都有多个分片,每个分片是一个Lucene索引
备份:拷贝一份分片就可以完成了分片的备份
第4章 基本用法
本章节主要带领学生一起学习ElasticSearch的基本使用方法,并了解常用的REST API,讲解过程中要将命令行与head插件相结合以演示相应操作。
4-1 索引创建
- 非结构化创建
- 结构化创建
RESTFul API:
API基本格式
http://<ip>:<port>/<索引>/<类型>/<文档ID>
常用HTTP动词:GET/PUT/POST/DELETE
Head插件创建索引(非结构化):
访问Head服务地址:http://127.0.0.1:9100
索引-新建索引
概览
粗线框为主分片,旁边细线框为分片备份
Head插件创建索引(结构化):
复合查询-查询
参数信息
http://127.0.0.1:9200/ book/novel/_mappings POST
{ "novel": { "properties": { "title":{ "type": "text" } } } }
验证JSON-提交请求
概览-刷新-信息-索引信息-mappings
Postman创建索引
PUT http://127.0.0.1:9200/people
{ "settings": { "number_of_shards": 3, "number_of_replicas": 1 }, "mappings": { "man": { "properties": { "name": {"type": "text"}, "country": {"type": "keyword"}, "age": { "type": "integer" }, "date": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" } } }, "woman": { } } }
4-2 插入
- 指定文档ID插入
- 自动产生文档ID插入
文档ID:唯一索引值,指向文档数据
指定文档ID插入:
PUT http://127.0.0.1:9200/people/man/1
{ "name": "瓦力", "country": "China", "age": 30, "date": "1987-03-07" }
概览-docs:索引下所有文档的数量值
数据浏览
自动产生文档ID插入:
POST http://127.0.0.1:9200/people/man
{ "name": "瓦力", "country": "China", "age": 30, "date": "1987-03-07" }
4-3 修改
- 直接修改文档
- 脚本修改文档
直接修改文档
POST http://127.0.0.1:9200/people/man/1/_update
{ "doc": { "name": "谁是瓦力" } }
脚本修改文档
POST http://127.0.0.1:9200/people/man/1/_update
{ "script": { "lang": "painless", "inline": "ctx._source.age += 10" } }
{ "script": { "lang": "painless", "inline": "ctx._source.age += params.age", "params": { "age": 100 } } }
4-4 删除
- 删除文档
- 删除索引
删除文档
DELETE http://127.0.0.1:9200/people/man/1
删除索引
Head插件删除
概览-动作-删除-输入删除-确定
Postman删除
DELETE http://127.0.0.1:9200/people
删除索引是一个非常危险的操作。
4-5 查询
- 简单查询
- 条件查询
- 聚合查询
事先创建索引book:
PUT http://127.0.0.1:9200/book
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
},
"mappings": {
"novel": {
"properties": {
"word_count": {
"type": "integer"
},
"author": {
"type": "keyword"
},
"title": {
"type": "text"
},
"publish_date": {
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
"type": "date"
}
}
}
}
}
插入数据:
PUT http://127.0.0.1:9200/book/novel/1
{
"word_count": "5000",
"author": "王五",
"title": "菜谱",
"publish_date": "2002-10-01"
}
book novel 5 1 王五 菜谱 5000 2002-10-01
book novel 8 1 瓦力 ElasticSearch入门 3000 2017-08-20
book novel 9 1 很胖的瓦力 ElasticSearch精通 3000 2017-08-15
book novel 10 1 牛魔王 芭蕉扇 1000 2000-10-01
book novel 2 1 李三 Java入门 2000 2010-10-01
book novel 4 1 李四 ElasticSearch大法好 1000 2017-08-01
book novel 6 1 赵六 剑谱 10000 1997-01-01
book novel 1 1 张三 移魂大法 1000 2000-10-01
book novel 7 1 张三丰 太极拳 1000 1997-01-01
book novel 3 1 张四 Python入门 2000 2005-10-01
book novel 11 1 孙悟空 七十二变 1000 2000-10-01
简单查询:
GET http://127.0.0.1:9200/book/novel/1
条件查询:
查询所有
POST http://127.0.0.1:9200/book/_search
{ "query": { "match_all": {} }, "form": 1, "size": 1 }
关键词查询
{ "query": { "match": { "title": "ElasticSearch" } }, "sort": [ {"publish_date": {"order": "desc"}} ] }
条件查询
POST http://127.0.0.1:9200/book/_search
{ "aggs": { "group_by_word_count": { "terms": { "field": "word_count" } }, "group_by_publish_date": { "terms": { "field": "publish_date" } } } }
聚合查询
POST http://127.0.0.1:9200/book/_search
{ "aggs": { "grades_word_count": { "stats": { "field": word_count } } } }
{ "aggs": { "grades_word_count": { "min": { "field": word_count } } } }
第5章 高级查询
将query与filter分开讲解,并把相应的关键词列举一些,结合实例操作,最后将query与filter组合起来,讲解实例
5-1 query
高级查询:
- 子条件查询:特定字段查询所指特定值
- 复合条件查询:以一定的逻辑组合子条件查询
子条件查询
- Query context
- Filter context
Query Context
在查询过程中,除了判断文档是否满足查询条件外,ES还会计算一个_score来标识匹配的程度,旨在判断目标文档和查询条件匹配的有多好。
常用查询:
- 全文本查询:针对文本类型数据
- 字段级别查询:针对结构化数据,如数字、日期等
POST http://127.0.0.1:9200/book/_search
模糊匹配(匹配:ElasticSearch+入门)
{ "query": { "match": { "title": "ElasticSearch入门" } } }
习语匹配(匹配:ElasticSearch入门)
{ "query": { "match_phrase": { "title": "ElasticSearch入门" } } }
多字段模糊匹配
{ "query": { "multi_match": { "query": "瓦力", "fields": ["author", "title"] } } }
语法匹配
{ "query": { "query_string": { "query": "ElasticSearch AND 大法" } } }
{ "query": { "query_string": { "query": "(ElasticSearch AND 大法) OR Python" } } }
多字段语法匹配
{ "query": { "query_string": { "query": "瓦力 OR ElasticSearch", "fields": ["title", "author"] } } }
字段匹配
{ "query": { "term": { "word_count": 1000 } } }
{ "query": { "term": { "author": "瓦力" } } }
范围匹配(数字、日期)
{ "query": { "range": { "word_count": { "gte": 1000, "lte": 2000 } } } }
{ "query": { "range": { "publish_date": { "gt": "2019-08-29", "lte": "now" } } } }
now:代表当前时间
5-2 filter
Filter Context
在查询过程中,只判断该文档是否满足条件,只有Yes或者No
POST http://127.0.0.1:9200/book/_search
{
"query": {
"bool": {
"filter": {
"term": {
"word_count": 1000
}
}
}
}
}
filter做数据过滤的,并且会对查询结果进行缓存,所以filter比query要快一些。
5-3 复合查询
常用查询:
- 固定分数查询
- 布尔查询
POST http://127.0.0.1:9200/book/_search
{
"query": {
"match": {
"title": "ElasticSearch"
}
}
}
{
"query": {
"constant_score": {
"filter": {
"match": {
"title": "ElasticSearch"
}
},
"boost": 2
}
}
}
{
"query": {
"bool": {
"should": [
{
"match": {
"author": "瓦力"
}
},
{
"match": {
"title": "ElasticSearch"
}
}
]
}
}
}
{
"query": {
"bool": {
"must": [
{
"match": {
"author": "瓦力"
}
},
{
"match": {
"title": "ElasticSearch"
}
}
],
"filter": [
{
"term": {
"word_count": 1000
}
}
]
}
}
}
{
"query": {
"bool": {
"must_not": {
"term": {
"author": "瓦力"
}
}
}
}
}
固定分数查询不支持match,只支持filter
第6章 Spring Boot集成ES
直接演示SpringBoot集成ES,做接口开发
以下使用ElasticSearch7.3.1做演示
6-1 SpringBoot集成ElasticSearch
实战演练:
- Spring Boot集成ES
- 图书信息管理接口开发
Pom.xml
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client-sniffer</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
配置Spring Bean
@Configuration
public class ESConfig {
@Bean
public RestHighLevelClient client() throws UnknownHostException {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("127.0.0.1", 9200, "http")));
return client;
}
}
ESController
@RestController
@RequestMapping("/novel")
public class ESController {
@Autowired
private RestHighLevelClient client;
// todo
}
运行
mvn spring-boot:run
6-2 查询接口开发
接口开发:
- 新增图书信息功能开发
- 修改图书信息功能开发
- 删除功能开发
- 综合查询接口开发
@GetMapping("/{id}")
public ResponseEntity get(@PathVariable(name = "id") String id) {
if (id.isEmpty()) {
return new ResponseEntity(HttpStatus.NOT_FOUND);
}
GetRequest novelRequest = new GetRequest("novel", id);
try {
GetResponse novelResponse = this.client.get(novelRequest, RequestOptions.DEFAULT);
if (novelResponse.isExists()) {
System.out.println(novelResponse);
return new ResponseEntity(novelResponse.getSource(), HttpStatus.OK);
}
} catch (IOException e) {
e.printStackTrace();
return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
}
return new ResponseEntity(HttpStatus.NOT_FOUND);
}
运行:
mvn spring-boot:run
GET http://127.0.0.1:8080/novel/1
6-3 增加接口开发
@PostMapping("")
public ResponseEntity add(@RequestBody Novel novel) {
try {
Gson gson = new Gson();
Map novelMap = gson.fromJson(gson.toJson(novel), Map.class);
IndexRequest novelRequest = new IndexRequest("novel").id(UUID.randomUUID().toString()).source(novelMap);
IndexResponse novelResponse = this.client.index(novelRequest, RequestOptions.DEFAULT);
System.out.println(novelResponse);
return new ResponseEntity(novelResponse.getId(), HttpStatus.OK);
} catch (IOException e) {
e.printStackTrace();
return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
}
}
@Data
public class Novel {
private String title;
private String author;
@JsonProperty(value = "word_count")
private String wordCount;
@JsonProperty(value = "publish_date")
@DateTimeFormat(pattern = "yyyy-MM-dd")
private Date publishDate;
}
运行
POST http://127.0.0.1:8080/novel
{
"title": "大话西游",
"author": "刘镇伟",
"word_count": "1008",
"publish_date": "2019-09-08"
}
6-4 删除接口开发
@DeleteMapping("/{id}")
public ResponseEntity delete(@PathVariable(name = "id") String id) {
DeleteRequest novelRequest = new DeleteRequest("novel", id);
try {
DeleteResponse novelResponse = this.client.delete(novelRequest, RequestOptions.DEFAULT);
System.out.println(novelResponse);
return new ResponseEntity(novelResponse.getResult().toString(),HttpStatus.OK);
} catch (IOException e) {
e.printStackTrace();
return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
}
}
运行
DELETE http://127.0.0.1:8080/novel/1
6-5 更新接口开发
@PutMapping("/{id}")
public ResponseEntity update(@PathVariable(name = "id") String id, @RequestBody Novel novel) {
try {
Gson gson = new Gson();
Map map = gson.fromJson(gson.toJson(novel), Map.class);
UpdateRequest novelRequest = new UpdateRequest("novel", id).doc(map);
UpdateResponse novelResponse = this.client.update(novelRequest, RequestOptions.DEFAULT);
System.out.println(novelResponse);
return new ResponseEntity(novelResponse.getResult().toString(), HttpStatus.OK);
} catch (IOException e) {
e.printStackTrace();
return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
}
}
运行
PUT http://127.0.0.1:8080/novel/1
{
"title": "大话西游",
"author": "刘镇伟",
"word_count": "1008",
"publish_date": "2019-09-08"
}
6-6 复合查询接口开发
@GetMapping("")
public ResponseEntity query(@RequestParam(value = "author", required = false) String author,
@RequestParam(value = "title", required = false) String title,
@RequestParam(value = "gt_word_count", defaultValue = "0") Integer gtWordCount,
@RequestParam(value = "lt_word_count", required = false) Integer ltWordCount) {
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
if (author != null) {
boolQueryBuilder.must(QueryBuilders.matchQuery("author", author));
}
if (title != null) {
boolQueryBuilder.must(QueryBuilders.matchQuery("title", title));
}
RangeQueryBuilder wordCountRangeQuery = QueryBuilders.rangeQuery("word_count")
.from(gtWordCount);
if (ltWordCount != null && 0 < ltWordCount) {
wordCountRangeQuery.to(ltWordCount);
}
boolQueryBuilder.filter(wordCountRangeQuery);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
.query(boolQueryBuilder)
.from(0)
.size(10);
SearchRequest novelSearchRequest = new SearchRequest("novel")
.source(searchSourceBuilder)
.searchType(SearchType.DFS_QUERY_THEN_FETCH);
System.out.println(novelSearchRequest.source());
try {
SearchResponse novelSearchResponse = this.client.search(novelSearchRequest, RequestOptions.DEFAULT);
System.out.println(novelSearchResponse);
ArrayList<Map<String, Object>> result = new ArrayList<>();
novelSearchResponse.getHits().forEach(e -> {
result.add(e.getSourceAsMap());
});
return new ResponseEntity(result, HttpStatus.OK);
} catch (IOException e) {
e.printStackTrace();
return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
}
}
第7章 课程总结
课程总结
7-1 课程总结
ES简介
ES使用场景例子
ES的火热程度-重要性
安装
ES使用插件HEAD
ES分布式安装,实现了ES的横向扩容
基础概念
基本用法
增删查改功能
高级查询
实战演练
将ES集成到SpringBoot
转载请注明来源,欢迎对文章中的引用来源进行考证,欢迎指出任何有错误或不够清晰的表达。可以在下面评论区评论,也可以邮件至 tuyrk@qq.com
文章标题:889-ElasticSearch入门
文章字数:4k
本文作者:神秘的小岛岛
发布时间:2019-08-13, 00:25:32
最后更新:2020-04-03, 13:59:59
原始链接:https://www.tuyrk.cn/imooc/889-ElasticSearch/版权声明: "署名-非商用-相同方式共享 4.0" 转载请保留原文链接及作者。