An Elasticsearch Gotcha
This post documents the process of changing the type of a field in an Elasticsearch mapping.
Our team uses Elasticsearch as a service for classifying, searching, and analyzing logs, with a _mapping similar to the following:
{
  "settings" : { "number_of_shards" : 20 },
  "mappings" : {
    "client" : {
      "properties" : {
        "ip" : { "type" : "long" },
        "cost" : { "type" : "long" }
      }
    }
  }
}
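Here is the problem: an IP address such as "127.0.0.1" in the log output cannot be converted to a long by Elasticsearch (it fails with java.lang.NumberFormatException), so the field has to become either a string or an ip, both of which Elasticsearch supports (see its documentation on data types). The goal was therefore clear: change the field type of the IP field in the mapping. After a look around elasticsearch.org it seemed, heh, that a simple mapping update would do.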
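To make the failure concrete, here is a minimal Python sketch of why a dotted address cannot be parsed as an integer, and what a numeric encoding of the same address would look like. The helper names are hypothetical and not part of the original setup; converting addresses on the client side is only an illustration, not what was done in this post.

```python
import ipaddress


def parses_as_long(text):
    """Return True if the text can be parsed as a plain integer."""
    try:
        int(text)
        return True
    except ValueError:
        # Python's analogue of the NumberFormatException seen on the Java side.
        return False


def ip_to_long(ip_text):
    """Encode a dotted IPv4 address as an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(ip_text))


def long_to_ip(value):
    """Decode an unsigned 32-bit integer back into dotted IPv4 notation."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    print(parses_as_long("127.0.0.1"))
    print(ip_to_long("127.0.0.1"))
    print(long_to_ip(ip_to_long("127.0.0.1")))
```

A value encoded this way would fit a long field, but every log producer and every query would then have to do the conversion, which is why changing the field type is the more practical route here.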
The update request:
curl -XPUT localhost:8301/store/client/_mapping -d '{ "client" : { "properties" : { "local_ip" : {"type" : "string", "store" : "yes"} } }}'
{"error":"MergeMappingException[Merge failed with failures {[mapper [local_ip] of different type, current_type [long], merged_type [string]]}]","status":400}
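Elasticsearch refuses to change the type of an existing field in place, so the way out is to create a new index with the corrected mapping:
curl -XPUT localhost:8305/store_v2 -d '{
  "settings" : { "number_of_shards" : 20 },
  "mappings" : {
    "client" : {
      "properties" : {
        "ip" : { "type" : "string" },
        "cost" : { "type" : "long" }
      }
    }
  }
}'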
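Since the new index differs from the old one in a single field type, the request body can also be derived from the old definition instead of being retyped by hand. The following is a small sketch under that assumption; `retype_field` is a hypothetical helper, and the dictionary mirrors the example mapping shown earlier rather than the real production mapping.

```python
import copy
import json


def retype_field(index_body, doc_type, field, new_type):
    """Return a copy of an index definition with one field's type changed.

    The input is left untouched so the old definition stays available
    for comparison.
    """
    new_body = copy.deepcopy(index_body)
    new_body["mappings"][doc_type]["properties"][field]["type"] = new_type
    return new_body


# Assumed shape, copied from the example mapping in this post.
old_body = {
    "settings": {"number_of_shards": 20},
    "mappings": {
        "client": {
            "properties": {
                "ip": {"type": "long"},
                "cost": {"type": "long"},
            }
        }
    },
}

new_body = retype_field(old_body, "client", "ip", "string")

if __name__ == "__main__":
    # The printed JSON can be passed to curl -d when creating the new index.
    print(json.dumps(new_body, indent=2))
```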
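Then copy the documents from the old index into the new one, here with a small pyes script:
import pyes

conn = pyes.es.ES("http://10.xx.xx.xx:8305/")
# Scan through every document of the old index, 1000 at a time.
search = pyes.query.MatchAllQuery().search(bulk_read=1000)
hits = conn.search(search, 'store_v1', 'client', scan=True, scroll="30m", model=lambda _, hit: hit)
for hit in hits:
    # Re-index each document under its original id, batched via bulk.
    conn.index(hit['_source'], 'store_v2', 'client', hit['_id'], bulk=True)
# Push out whatever is still buffered.
conn.flush()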
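The `bulk=True` flag above makes pyes batch the index requests. At the HTTP level, Elasticsearch's `_bulk` endpoint takes newline-delimited JSON: one action line followed by one source line per document. The sketch below only builds such a body from scan hits and sends nothing; `to_bulk_lines` is a hypothetical helper, and the sample hits are made up for illustration.

```python
import json


def to_bulk_lines(hits, index, doc_type):
    """Build a newline-delimited _bulk body that re-indexes the given hits.

    Each hit is expected to carry "_id" and "_source", as scan results do.
    Every line, including the last one, ends with a newline.
    """
    lines = []
    for hit in hits:
        action = {"index": {"_index": index, "_type": doc_type, "_id": hit["_id"]}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    return "".join(line + "\n" for line in lines)


# Made-up sample documents, shaped like the example mapping.
sample_hits = [
    {"_id": "1", "_source": {"ip": "127.0.0.1", "cost": 12}},
    {"_id": "2", "_source": {"ip": "10.0.0.7", "cost": 30}},
]

if __name__ == "__main__":
    print(to_bulk_lines(sample_hits, "store_v2", "client"), end="")
```

Keeping the original `_id` means that running the copy again overwrites documents instead of duplicating them.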
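After a little over an hour, the new index held essentially the same data as the old one. I did not bother with the documents written to the live index at the very moment the copy finished; our accuracy requirements are not that strict, so good enough is good enough. Next, repoint the alias (if you were not already accessing the index through an alias before changing the mapping, you are in for some tears):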
curl -XPOST localhost:8305/_aliases -d '{ "actions": [ { "remove": { "alias": "store", "index": "store_v1" }}, { "add": { "alias": "store", "index": "store_v2" }} ]}'
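The alias request above puts the remove and the add into a single `actions` list, so the switch is sent as one request. A minimal sketch of building that payload programmatically follows; `alias_swap_actions` is a hypothetical helper, not an Elasticsearch or pyes function.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the _aliases request body that moves an alias between indices."""
    return {
        "actions": [
            {"remove": {"alias": alias, "index": old_index}},
            {"add": {"alias": alias, "index": new_index}},
        ]
    }


if __name__ == "__main__":
    # The printed JSON is what gets POSTed to /_aliases.
    print(json.dumps(alias_swap_actions("store", "store_v1", "store_v2")))
```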
Ta-da, the new index is now catching up on data.
Once it has caught up,
delete the old index:
curl -XDELETE localhost:8303/store_v1
And that's it!
Such a simple task, and Elasticsearch manages to make it this complicated. Truly impressive...