Elasticsearch Administration

  • monitoring without alerting is equivalent to no monitoring
  • alerting with too much noise is equivalent to no alerting
  • despite all efforts, your alert rules may still produce false positives (noise) and false negatives (uncaught incidents). To build a robust system, review monitoring and alerting in every postmortem, identify what your monitoring missed, and close the gap.
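One way to make the postmortem review above concrete is to tally how often alerts were noise and how often incidents went uncaught. The following is a minimal sketch; the `AlertReview` class and the numbers in it are hypothetical and only illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class AlertReview:
    """Counts gathered from postmortems over one review period (hypothetical)."""

    true_positives: int   # alerts that fired for a real incident
    false_positives: int  # alerts that fired with no real incident (noise)
    false_negatives: int  # incidents that no alert caught

    def precision(self) -> float:
        # Share of fired alerts that were real; 0.0 by convention if none fired.
        fired = self.true_positives + self.false_positives
        return self.true_positives / fired if fired else 0.0

    def recall(self) -> float:
        # Share of real incidents that an alert caught; 0.0 if there were none.
        incidents = self.true_positives + self.false_negatives
        return self.true_positives / incidents if incidents else 0.0


# Made-up numbers for illustration only.
review = AlertReview(true_positives=6, false_positives=14, false_negatives=2)
print(f"precision={review.precision():.2f} recall={review.recall():.2f}")
# prints: precision=0.30 recall=0.75
```

Low precision suggests the rules are too noisy, while low recall suggests a rule is missing or its threshold is too loose.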
groups:
  - name: ElasticSearch
    rules:
      - alert: UnassignedShards
        expr: elasticsearch_cluster_health_unassigned_shards > 0
      - alert: ClusterRed
        expr: elasticsearch_cluster_health_status{color="red"} == 1
      - alert: JVMUsage
        expr: (elasticsearch_jvm_memory_used_bytes / elasticsearch_jvm_memory_max_bytes) > 0.9
      - alert: HealthyNodes
        expr: elasticsearch_cluster_health_number_of_nodes < 3
      - alert: NumberOfPendingTasks
        expr: elasticsearch_cluster_health_number_of_pending_tasks > 0
      - alert: ElasticSearchUsedFS
        expr: (elasticsearch_filesystem_data_size_bytes - elasticsearch_filesystem_data_free_bytes) / elasticsearch_filesystem_data_size_bytes * 100 > 90
curl -X GET "localhost:9200/_cluster/health?pretty"
curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
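The three commands above form a natural sequence: check health, ask why shards are unassigned, then retry the failed allocations. Here is a rough sketch of that decision, assuming the health response has already been parsed into a dict. The `triage` function and the trimmed `sample` response are hypothetical; a real `_cluster/health` response carries more fields.

```python
import json

HOST = "localhost:9200"  # same address as the curl commands above


def triage(health: dict) -> list[str]:
    """Suggest follow-up commands for a parsed _cluster/health response."""
    steps = []
    if health.get("unassigned_shards", 0) > 0:
        # First find out why the shards are unassigned.
        steps.append(f'curl -X GET "{HOST}/_cluster/allocation/explain?pretty"')
        # Then retry allocations that previously failed.
        steps.append(f"curl -X POST '{HOST}/_cluster/reroute?retry_failed'")
    return steps


# Trimmed, made-up response for illustration only.
sample = json.loads('{"status": "yellow", "unassigned_shards": 2}')
for command in triage(sample):
    print(command)
```

Note that the retry only helps shards whose earlier allocation attempts failed; if the explain output points to another cause, such as a full disk, that has to be fixed first.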

One9twO