In a 2-node HA cluster without a quorum server, neither cluster node (partition) holds more than 50% of the votes when a split-brain occurs. We therefore configure corosync.conf to enable two_node: 1, so that both nodes (partitions) are granted "quorum". The drawback is that both nodes may then try to fence each other at the same time, and such a fencing match can end in double-fencing.
The current solution is to configure random or static fencing delays through the pcmk_delay_max and pcmk_delay_base parameters of the stonith resources, which prevents double-fencing.
Still, an interesting further improvement would be to make sure that the more significant node, that is, the one hosting the more significant resources or (promoted) instances, wins the fencing match.
A good idea is to reuse the existing priority meta-attribute for resources. Users set priority on the significant resources that they want to matter. When fencing is needed, the cluster scheduler calculates each node's priority by summing up the priority of the resources/instances hosted on it. Promoted instances count slightly higher (+1) than the base priority in this calculation, so the instances belonging to the same promotable clone resource get different priority values, ranked as "Promoted > Started > Stopped".
A cluster-wide priority-fencing-delay is then applied to the node with the highest total resource priority, so that fencing targeting that node is delayed. The delay is added as a parameter of the fencing action in the cluster transition graph and passed to fenced by controld.
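The calculation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the scheduler's actual code: the tuple layout of the input, the function names node_priorities and fencing_delays, and the node and resource names are all hypothetical. The sketch also assumes that stopped instances and resources without a priority contribute nothing, and that no delay is applied when the nodes tie.

```python
from collections import defaultdict

PROMOTED_BONUS = 1  # the "+1" for promoted instances described above


def node_priorities(instances):
    """Sum resource priorities per node.

    instances: iterable of (node, resource, priority, role) tuples,
    a hypothetical input shape chosen for this sketch.
    """
    totals = defaultdict(int)
    for node, _resource, priority, role in instances:
        contribution = 0
        # Assumption: stopped instances and resources with no priority
        # set (0) do not add anything to the node's total.
        if role != "Stopped" and priority != 0:
            bonus = PROMOTED_BONUS if role == "Promoted" else 0
            contribution = priority + bonus
        totals[node] += contribution
    return dict(totals)


def fencing_delays(totals, priority_fencing_delay):
    """Return the delay (seconds) for the fencing action targeting each node."""
    if not totals:
        return {}
    top = max(totals.values())
    winners = [node for node, prio in totals.items() if prio == top]
    # Assumption: on a tie no node is preferred, so nobody gets the delay.
    preferred = winners[0] if len(winners) == 1 else None
    return {
        node: (priority_fencing_delay if node == preferred else 0)
        for node in totals
    }


instances = [
    ("node1", "db", 10, "Promoted"),  # counts as 10 + 1
    ("node2", "db", 10, "Started"),   # counts as 10
    ("node2", "web", 0, "Started"),   # no priority set, ignored
]
totals = node_priorities(instances)
delays = fencing_delays(totals, priority_fencing_delay=15)
print(totals)  # {'node1': 11, 'node2': 10}
print(delays)  # {'node1': 15, 'node2': 0}
```

Here node1 hosts the promoted instance, so the fencing action that targets node1 is held back by 15 seconds, which gives node1 the time to fence node2 first.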
More details can be found at: https://confluence.suse.com/pages/viewpage.action?spaceKey=hateam&title=More+significant+node+wins+fencing+match+under+2-node+split-brain
This project is part of:
Hack Week 19