In a 2-node HA cluster without a quorum server, neither node (partition) holds more than 50% of the votes after a split-brain. We therefore set two_node: 1 in corosync.conf so that both nodes (partitions) are granted "quorum". The downside is a potential fencing match: each node tries to fence the other, which can occasionally end in double-fencing.
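The vote arithmetic behind this can be sketched as follows. This is a simplified model for illustration only, not corosync code: the function name has_quorum and its parameters are made up here, and details such as wait_for_all are ignored.

```python
def has_quorum(partition_votes: int, expected_votes: int, two_node: bool = False) -> bool:
    """Simplified quorum check for one partition (hypothetical helper)."""
    if two_node and expected_votes == 2:
        # Simplified view of two_node: 1: a single surviving vote is enough.
        return partition_votes >= 1
    # Default rule: strictly more than half of the expected votes.
    return partition_votes > expected_votes / 2


# After a split-brain, each partition of a 2-node cluster holds 1 of 2 votes.
print(has_quorum(1, 2))                 # default majority rule
print(has_quorum(1, 2, two_node=True))  # with two_node: 1
```

With the default rule both partitions lose quorum; with two_node: 1 both keep it, which is why both may start fencing each other.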

The current solution is to configure random or static fencing delays through the pcmk_delay_max and pcmk_delay_base parameters of the stonith resources to prevent double-fencing.
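Why a delay settles the match can be shown with a toy model. It assumes exactly two nodes and that a fencing action succeeds as soon as its delay expires; the function survivor is a hypothetical helper, not part of Pacemaker.

```python
from typing import Dict, Optional


def survivor(delay_before_fencing: Dict[str, float]) -> Optional[str]:
    """Return the node that wins a 2-node fencing match, or None on a tie.

    delay_before_fencing maps each node to the delay (in seconds) applied
    to the fencing action that targets it.
    """
    (node_a, delay_a), (node_b, delay_b) = delay_before_fencing.items()
    if delay_a == delay_b:
        # Both actions fire at the same moment: double-fencing is possible.
        return None
    # The node whose fencing is delayed longer gets to shoot first.
    return node_a if delay_a > delay_b else node_b


print(survivor({"node1": 5, "node2": 0}))
print(survivor({"node1": 0, "node2": 0}))
```

In this model a static delay picks a fixed winner, while a random delay only makes a tie unlikely.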

Still, an interesting further improvement would be to make sure that the more significant node, the one hosting the more significant resources or (promoted) instances, wins the fencing match.

A good idea is:

  • Make use of the existing priority meta-attribute for resources. Users set a priority on the significant resources that they want to matter. When fencing is needed, the cluster scheduler calculates each node's priority by summing up the priorities of the resources/instances it hosts. A promoted instance counts slightly higher (+1) than the base priority, so the instances of the same promotable clone resource get different priority values: "Promoted > Started > Stopped".

  • A cluster-wide priority-fencing-delay is applied to fencing actions that target the node with the highest total resource priority, so that this node gets a head start in the fencing match. The delay is added as a parameter of the fencing action in the cluster transition graph and passed to fenced by controld.
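The two steps above can be sketched roughly as follows. This is an illustrative model only, not the scheduler's actual code: the Instance class and both functions are hypothetical, and the handling of a tie (no delay for anyone) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Instance:
    resource: str
    node: Optional[str]      # None means the instance is stopped
    priority: int = 0        # value of the priority meta-attribute
    promoted: bool = False


def node_priorities(instances: Iterable[Instance]) -> Dict[str, int]:
    """Sum the priorities of the instances running on each node."""
    totals: Dict[str, int] = {}
    for inst in instances:
        if inst.node is None:
            continue  # stopped instances contribute nothing
        bonus = 1 if inst.promoted else 0  # promoted counts as base + 1
        totals[inst.node] = totals.get(inst.node, 0) + inst.priority + bonus
    return totals


def fencing_delays(instances: List[Instance], nodes: List[str],
                   priority_fencing_delay: int) -> Dict[str, int]:
    """Delay to apply to the fencing action that targets each node."""
    totals = node_priorities(instances)
    prio = {n: totals.get(n, 0) for n in nodes}
    top = max(prio.values())
    if all(p == top for p in prio.values()):
        # Assumption: with equal priorities there is no node to favor.
        return {n: 0 for n in nodes}
    return {n: priority_fencing_delay if prio[n] == top else 0 for n in nodes}


db = [Instance("db", "node1", priority=10, promoted=True),
      Instance("db", "node2", priority=10)]
print(node_priorities(db))
print(fencing_delays(db, ["node1", "node2"], 15))
```

Here node1 hosts the promoted instance, so fencing that targets node1 is the one that gets delayed and node1 is the likely winner of the match.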

More details can be found at: https://confluence.suse.com/pages/viewpage.action?spaceKey=hateam&title=More+significant+node+wins+fencing+match+under+2-node+split-brain

This project is part of:

Hack Week 19

Activity

  • about 4 years ago: yfjiang liked this project.
  • about 4 years ago: yan_gao started this project.
  • about 4 years ago: yan_gao originated this project.
