I'm trying to understand how replicas work in Elasticsearch.
This is a very basic question, but can someone explain why one node is not just all replicas? Surely in this example, if one node fails 50% of the shards will be lost?
In your example, there are two nodes.
The data is split into 4 shards: A, B, C, D.
Both nodes hold a copy of all 4 shards. Each node holds the primary (writable) copy of 2 shards and the replica (read-only) copy of the other 2.
This lets you spread the workload across both nodes, with each node serving roughly half of it. In the event of a node failure, the replica shards on the surviving node are promoted to primary, and that single node serves 100% of the workload. There is no data loss because A, B, C, and D exist on both nodes.
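To make the failover concrete, here is a small Python sketch that models the layout from the example and simulates losing one node. It is only a toy model, not Elasticsearch code: the node names `node1` and `node2`, the assignment of A and B as primaries on `node1`, and the helper functions `fail_node` and `available_shards` are all assumptions made for illustration.

```python
# Toy model of the example: 4 shards, 2 nodes, 1 replica per shard.
# Node names and which node holds which primary are assumed for illustration.
cluster = {
    "node1": {"A": "primary", "B": "primary", "C": "replica", "D": "replica"},
    "node2": {"A": "replica", "B": "replica", "C": "primary", "D": "primary"},
}


def fail_node(cluster, failed):
    """Return a new cluster state with `failed` removed and replicas promoted."""
    # Copy the surviving nodes so the original state is left untouched.
    survivors = {
        name: dict(shards) for name, shards in cluster.items() if name != failed
    }
    # Every primary that lived on the failed node needs a new primary.
    lost_primaries = [
        shard for shard, role in cluster[failed].items() if role == "primary"
    ]
    for shard in lost_primaries:
        for shards in survivors.values():
            if shards.get(shard) == "replica":
                shards[shard] = "primary"  # promote the surviving copy
                break
    return survivors


def available_shards(cluster):
    """List the shards that still have a primary somewhere in the cluster."""
    return sorted(
        {
            shard
            for shards in cluster.values()
            for shard, role in shards.items()
            if role == "primary"
        }
    )


after = fail_node(cluster, "node1")
print(after)
print(available_shards(after))
```

After `node1` is removed, `node2` holds the primary for all four shards, so nothing is lost. What is lost is redundancy: until a second node rejoins, no replica copies exist, because a replica is never allocated on the same node as its primary.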