Be aware that there is a bug when you try to create an override on the cluster resource group rollup monitor to change its maintenance mode behavior!

Let’s see:

In my example I have one cluster with 2 cluster nodes. I needed to do some maintenance on my nodes, and they needed to be rebooted. So I put one node in maintenance mode so that my NOC team would not get alerted, did what I had to do, and finally restarted the node. Then I did the same for the other node.

Suddenly I had someone from the NOC team yelling at me that they had received an alert saying there was a problem with the cluster resource group!!

I opened the cluster resource group health explorer and indeed:

image

The underlying resources were in maintenance mode, but the availability rollup monitor was not! So I had a look at the monitor, and apparently it is configured to roll up maintenance mode as an error:

image
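
To make the effect of that setting concrete, here is a minimal sketch of a worst-of rollup in Python. It is a simplified model, not the SCOM API: the names `Health` and `rollup` and the `maintenance_policy` parameter are hypothetical and only illustrate why a member in maintenance mode can turn the parent red.

```python
from enum import IntEnum


class Health(IntEnum):
    """Simplified health states, ordered from best to worst."""
    HEALTHY = 0
    WARNING = 1
    ERROR = 2


def rollup(members, maintenance_policy):
    """Return the worst state among the members.

    members: list of (health, in_maintenance) tuples.
    maintenance_policy: the Health value substituted for a member that is
        in maintenance mode, or None to leave that member out of the rollup.
    (Hypothetical model; the real monitor is configured in the console.)
    """
    states = []
    for health, in_maintenance in members:
        if in_maintenance:
            if maintenance_policy is None:
                continue  # member is ignored
            states.append(maintenance_policy)
        else:
            states.append(health)
    # With no members left to evaluate, report healthy.
    return max(states, default=Health.HEALTHY)


# One node in maintenance mode, the other one healthy.
members = [(Health.HEALTHY, True), (Health.HEALTHY, False)]

print(rollup(members, Health.ERROR).name)  # maintenance rolled up as error
print(rollup(members, None).name)          # maintenance ignored
```

With the first policy the parent goes to error as soon as one member enters maintenance mode, which matches what the NOC team saw.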

So I wanted to change the maintenance mode parameter so that when I put a node into maintenance mode, the monitor rolls it up as maintenance mode instead of as an error:

image

And here is the catch:

image

When you try to save the override, you get this nice red error! So it’s impossible to change the behavior of the resource group rollup monitor!

So now, when we need to put a cluster node in maintenance mode, we also put its resource groups in maintenance mode.
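
As a rough sketch of that workaround, the snippet below builds the list of objects that should enter maintenance mode together. Everything in it is an assumption for illustration: the node names, the group names, the `groups_by_node` mapping and both helper functions are hypothetical, and it only plans the window rather than calling Operations Manager.

```python
from datetime import datetime, timedelta


def maintenance_targets(node, groups_by_node):
    """Return the node plus every resource group mapped to it."""
    return [node] + sorted(groups_by_node.get(node, []))


def plan_window(targets, start, minutes):
    """Give every target the same start and end time."""
    end = start + timedelta(minutes=minutes)
    return [{"target": t, "start": start, "end": end} for t in targets]


# Hypothetical inventory: which resource groups each node hosts.
groups_by_node = {
    "NODE1": ["SQL Group", "File Share Group"],
    "NODE2": [],
}

plan = plan_window(
    maintenance_targets("NODE1", groups_by_node),
    start=datetime(2024, 1, 1, 22, 0),  # example start time
    minutes=60,
)
for entry in plan:
    print(entry["target"], entry["end"].isoformat())
```

The point of planning the node and its groups as one batch is that nothing is left outside the window for the rollup monitor to report on.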

 

Have fun,

Alexandre Verkinderen
