I'm trying to understand how replicas work in Elasticsearch.

enter image description here

This is a very basic question, but can someone explain why one node doesn't simply hold all the replicas? Surely in this example, if one node fails, 50% of the shards will be lost?

F.D

1 Answer

In your example, there are two nodes.

The data is split into 4 shards: A, B, C, and D.

Both nodes hold a copy of all 4 shards: each node holds the primary (writable) copy of 2 shards and the secondary (read-only) replica copy of the other 2.

This lets you spread the workload across both nodes, with each node serving half of it. In the event of a node failure, the replica shards on the surviving node are promoted to primary, and that single node serves 100% of the workload. There is no data loss because A, B, C, and D exist on both nodes.
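
To make the layout and the failover concrete, here is a minimal Python sketch of the idea. It is a toy in-memory model, not the Elasticsearch API: the function names `allocate` and `fail_node`, the node names, and which node gets which primary are all assumptions made for illustration.

```python
# Toy model of primary/replica placement; hypothetical names, not Elasticsearch code.

def allocate(shards, nodes):
    """Put each shard's primary on one node and its replica on the next node."""
    layout = {node: {} for node in nodes}
    for i, shard in enumerate(shards):
        primary_node = nodes[i % len(nodes)]
        replica_node = nodes[(i + 1) % len(nodes)]
        layout[primary_node][shard] = "primary"
        layout[replica_node][shard] = "replica"
    return layout


def fail_node(layout, failed):
    """Remove the failed node and promote replicas of the primaries it held."""
    lost_primaries = {s for s, role in layout[failed].items() if role == "primary"}
    survivors = {n: dict(copies) for n, copies in layout.items() if n != failed}
    for copies in survivors.values():
        for shard in copies:
            if shard in lost_primaries:
                copies[shard] = "primary"
    return survivors


layout = allocate(["A", "B", "C", "D"], ["node1", "node2"])
after = fail_node(layout, "node1")

print(layout)  # each node: 2 primaries, 2 replicas
print(after)   # node2 alone now holds a primary for A, B, C and D
```

In this model, losing `node1` removes only one copy of each shard, so the surviving node still has the full data set.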

AMtwo
  • You may want to clarify that this is specifically to distribute writes, because they happen on primary shards first. Reads could be distributed even if there were only one node holding all primary shards. – mustaccio Mar 06 '21 at 18:44