Our database is distributed over 3 partitions equally and mirroring is not enabled. What will happen if one node dies? Will the manager try and recover it on another node?
With 3 nodes and no mirroring, you have 3 partitions. If one partition fails, what happens depends on the settings in config.xml:
- if max_failing_partitions == 0: Eafis stops processing queued requests and waits until the partition becomes active again. When the failed partition comes back up, it has to be reloaded with the missing templates/user records from the DB, so that load runs first. When it is finished, Eafis resumes processing queued messages.
- if max_failing_partitions == 1: Eafis stops processing queued requests and starts loading the missing templates from the DB into the memory of the remaining partitions (proportionately). When it is finished, Eafis resumes processing queued messages. You have to make sure that the missing templates will fit into the memory of the remaining partitions. When the failed partition comes back up after its failure is fixed, templates are migrated back to it from those partitions by the load balancing (LB) feature. The Eafis cluster is fully operational during LB, but overall throughput is reduced until LB completes.
- if you have 4 configured partitions but max_live_partition == 3: when 1 partition fails, only 2 live partitions are left. Eafis stops processing queued requests, starts the Matcher process (INode) on the 4th partition, and starts loading the missing templates from the DB into the memory of that backup partition. When it is finished, Eafis resumes processing queued messages.
NOTE: in all these cases max_failing_matchers is set to -1, which means that the Matcher count check is disabled (only the partition count matters here, because no mirroring is used).
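
To make the three cases easier to compare, here is a rough Python sketch of the decision as described above. It is only a model for reasoning about the settings, not Eafis code: the class and function names are made up for illustration, the order in which the checks are applied is my assumption, and so is the idea that the templates of the failed partition are spread equally over the remaining ones. The second function is the back-of-the-envelope memory check for the max_failing_partitions == 1 case.

```python
from dataclasses import dataclass
from enum import Enum
from math import ceil


class Action(Enum):
    WAIT_FOR_PARTITION = "wait for the failed partition, then reload it from DB"
    REDISTRIBUTE = "load missing templates into the remaining partitions"
    START_BACKUP = "start a Matcher on a spare partition and load it from DB"


@dataclass
class ClusterSettings:
    # Hypothetical container; the real values live in config.xml.
    configured_partitions: int
    max_failing_partitions: int
    max_live_partition: int  # spelled as in the answer above


def on_partition_failure(cfg: ClusterSettings, failed: int = 1) -> Action:
    """Model which of the three described behaviours applies."""
    # Spare partitions are configured but not live (case 3).
    spare = cfg.configured_partitions - cfg.max_live_partition
    if spare >= failed:  # assumption: a spare is preferred when available
        return Action.START_BACKUP
    if failed <= cfg.max_failing_partitions:
        return Action.REDISTRIBUTE
    return Action.WAIT_FOR_PARTITION


def fits_after_failure(templates_per_partition: int,
                       capacity_per_partition: int,
                       live: int, failed: int = 1) -> bool:
    """Check whether the remaining partitions can absorb the missing templates.

    Assumes equal distribution before the failure and an equal split of
    the missing templates afterwards.
    """
    remaining = live - failed
    if remaining <= 0:
        return False
    extra = ceil(templates_per_partition * failed / remaining)
    return templates_per_partition + extra <= capacity_per_partition


if __name__ == "__main__":
    print(on_partition_failure(ClusterSettings(3, 0, 3)).name)
    print(on_partition_failure(ClusterSettings(3, 1, 3)).name)
    print(on_partition_failure(ClusterSettings(4, 0, 3)).name)
    print(fits_after_failure(1_000_000, 1_500_000, live=3))
```

For your setup (3 partitions, equal distribution), losing one partition means each of the two remaining partitions has to hold roughly 1.5 times its normal share, so that is the headroom to plan for if you set max_failing_partitions to 1.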