🛠️ Mapping + Analyzer + Search Query: Comprehensive Practice
This document is a hands-on example for practicing mapping design, custom analyzer configuration, and search queries together, using the posts index in Elasticsearch.
1. 🎯 Goals
- Create the posts index with a custom analyzer applied
- Configure the title, tags, and content fields as text or keyword, as appropriate
- Use match, filter, sort, and search_after in realistic search queries
2. ⚙️ Custom Analyzer + Mapping Example
PUT http://localhost:9200/posts
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_nori_tokenizer": {
          "type": "nori_tokenizer",
          "decompound_mode": "mixed"
        }
      },
      "filter": {
        "my_pos_filter": {
          "type": "nori_part_of_speech",
          "stoptags": ["E", "IC", "J"]
        },
        "my_stop_filter": {
          "type": "stop",
          "stopwords": ["은", "는", "이", "가"]
        }
      },
      "analyzer": {
        "my_korean_analyzer": {
          "type": "custom",
          "tokenizer": "my_nori_tokenizer",
          "filter": [
            "lowercase",
            "my_pos_filter",
            "my_stop_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_korean_analyzer",
        "fields": {
          "raw": { "type": "keyword" }
        }
      },
      "tags": {
        "type": "keyword"
      },
      "content": {
        "type": "text",
        "analyzer": "my_korean_analyzer"
      },
      "author": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date"
      }
    }
  }
}
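Once the index exists, one way to check what my_korean_analyzer actually emits is the _analyze API. The following is a minimal Python sketch that only builds the request and reads a response; the helper names (build_analyze_request, token_texts) and the sample sentence are assumptions for illustration, and sending the request is left to whatever HTTP client you use.

```python
import json

ES_URL = "http://localhost:9200"  # same host as the requests in this document


def build_analyze_request(text, index="posts", analyzer="my_korean_analyzer"):
    """Return (url, body) for a POST to the index-level _analyze endpoint."""
    url = f"{ES_URL}/{index}/_analyze"
    # ensure_ascii=False keeps the Korean text readable in the JSON body
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return url, body


def token_texts(analyze_response):
    """Extract the token strings from a parsed _analyze response."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


# Hypothetical sample sentence; any Korean text works here.
url, body = build_analyze_request("카카오의 새로운 간편결제 서비스")
print("POST", url)
print(body)
```

Comparing the tokens returned for a sentence with and without particles such as 의 or 가 is a quick way to see whether the part-of-speech filter and the stop filter behave as intended.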
3. 📦 Inserting Sample Documents (Bulk API)
POST http://localhost:9200/posts/_bulk
{ "index": {} }
{ "title": "์นด์นด์คํ์ด ์๋น์ค ์ถ์", "tags": ["fintech", "kakao"], "content": "์นด์นด์ค์ ์๋ก์ด ๊ฐํธ๊ฒฐ์ ์๋น์ค๊ฐ ์์๋์์ต๋๋ค.", "author": "alice", "created_at": "2024-01-01T10:00:00Z" }
{ "index": {} }
{ "title": "๋ค์ด๋ฒ ์ง๋ ์
๊ทธ๋ ์ด๋", "tags": ["map", "naver"], "content": "์ง๋ ๊ฒ์์ด ๋ ๋น ๋ฅด๊ณ ์ ํํด์ก์ต๋๋ค.", "author": "bob", "created_at": "2024-02-15T12:30:00Z" }
{ "index": {} }
{ "title": "๊ตฌ๊ธ ๊ฒ์ AI ๋์
", "tags": ["ai", "google"], "content": "AI๋ก ๋ ๋๋ํ ๊ฒ์์ด ๊ฐ๋ฅํด์ก์ต๋๋ค.", "author": "charlie", "created_at": "2024-03-10T15:00:00Z" }
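The bulk body is newline-delimited JSON: each action line and each document must sit on exactly one line, and the body has to end with a newline. A small Python sketch of how such a body could be generated is shown below; build_bulk_body is a hypothetical helper, and the documents are shortened versions of the samples above.

```python
import json


def build_bulk_body(docs):
    """Serialize docs into an NDJSON body for POST /posts/_bulk."""
    lines = []
    for doc in docs:
        # Action line: an empty "index" action lets Elasticsearch assign the _id
        lines.append(json.dumps({"index": {}}))
        # Source line: json.dumps never emits raw newlines, so it stays on one line
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API expects the final line to be terminated by a newline too
    return "\n".join(lines) + "\n"


docs = [
    {"title": "카카오페이 서비스 출시", "tags": ["fintech", "kakao"], "author": "alice"},
    {"title": "네이버 지도 업그레이드", "tags": ["map", "naver"], "author": "bob"},
]
print(build_bulk_body(docs), end="")
```

When sending the result, set the Content-Type header to application/x-ndjson (application/json is also accepted for _bulk).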
4. 🔍 Search Query Examples
🎯 4-1. Find documents that contain 카카오 and have the fintech tag
POST http://localhost:9200/posts/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "카카오" } }
      ],
      "filter": [
        { "term": { "tags": "fintech" } }
      ]
    }
  }
}
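The query above puts the full-text condition in must, where it contributes to the relevance score, and the exact tag condition in filter, which only includes or excludes documents. As a rough sketch, the same body could be built from parameters in Python; build_tagged_search is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json


def build_tagged_search(keyword, tag, field="content"):
    """Full-text match on `field`, restricted to documents carrying `tag`."""
    return {
        "query": {
            "bool": {
                # must: analyzed full-text match, contributes to _score
                "must": [{"match": {field: keyword}}],
                # filter: exact match on the keyword field, no effect on _score
                "filter": [{"term": {"tags": tag}}],
            }
        }
    }


print(json.dumps(build_tagged_search("카카오", "fintech"), ensure_ascii=False, indent=2))
```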
🧭 4-2. Paginated search with sort + search_after
POST http://localhost:9200/posts/_search
{
  "size": 2,
  "query": {
    "match_all": {}
  },
  "sort": [
    { "created_at": "desc" },
    { "_id": "asc" }
  ],
  "search_after": ["2024-01-01T10:00:00Z", "doc_id_2"]
}
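The search_after values in the request above are placeholders. In practice they are copied from the sort array of the last hit on the previous page, with one value per sort clause. The sketch below shows that step in Python; next_page_request is a hypothetical helper, and the response fragment is made up for illustration (date sort values are normally returned as epoch milliseconds).

```python
import copy


def next_page_request(request_body, response):
    """Return the request body for the next page, or None when no hits remain."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    next_body = copy.deepcopy(request_body)  # leave the caller's request untouched
    # The last hit's "sort" array is the cursor for the following page
    next_body["search_after"] = hits[-1]["sort"]
    return next_body


first_page = {
    "size": 2,
    "query": {"match_all": {}},
    "sort": [{"created_at": "desc"}, {"_id": "asc"}],
}
# Illustrative response fragment, not real output from a cluster
fake_response = {
    "hits": {"hits": [
        {"_id": "doc_id_1", "sort": [1710082800000, "doc_id_1"]},
        {"_id": "doc_id_2", "sort": [1708000200000, "doc_id_2"]},
    ]}
}
print(next_page_request(first_page, fake_response)["search_after"])
```

One caveat to verify against your Elasticsearch version: sorting on _id relies on fielddata for that field, which newer releases may disable by default. If the request is rejected, a dedicated keyword field holding a unique value is a common substitute for the tiebreaker.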
4-3. Sorting by title (using the multi-field)
POST http://localhost:9200/posts/_search
{
  "query": {
    "match": {
      "title": "검색"
    }
  },
  "sort": [
    { "title.raw": "asc" }
  ]
}
✅ Summary
Item | Description |
---|---|
Custom analyzer setup | nori tokenizer + stopword filter + part-of-speech filter |
Multi-field configuration | title uses text and keyword together (title.raw is for sorting) |
Search queries | match, filter, sort, and search_after combined |
Bulk insert | Fast insertion of larger data sets for testing |