OpsMgr bug: console crashing when creating an override on cluster resource group monitor
Be aware that there is a bug when you try to create an override on the resource group rollup monitor to change its maintenance mode behavior!
Let’s see:
In my example I have one cluster with 2 cluster nodes. I needed to do some maintenance on my nodes, and they needed to be rebooted. So I put one node in maintenance mode so that my NOC team would not get alerted, did what I had to do, restarted the node, and then did the same for the other node.
Suddenly I had someone from the NOC team yelling at me that they had received an alert saying there was a problem with the cluster resource group!!
I opened the cluster resource group health explorer and indeed:
The underlying resources were in maintenance mode, but the availability rollup monitor was not! So I had a look at the monitor, and apparently it is configured to roll up maintenance mode as an error:
So I wanted to change the maintenance mode parameter so that when I put a node into maintenance mode, the monitor rolls it up as maintenance mode and not as an error:
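To make the difference between the two settings concrete, here is a toy model in Python. It is only a sketch and not how OpsMgr implements the monitor: the `Health` enum, the `rollup` function, the `maintenance_as` parameter and the worst-of rule are all assumptions made up for illustration.

```python
from enum import IntEnum


class Health(IntEnum):
    # Hypothetical health states, ordered so that max() returns the worst one.
    HEALTHY = 0
    WARNING = 1
    ERROR = 2


def rollup(members, maintenance_as):
    """Toy worst-of rollup (assumed policy, not the real OpsMgr logic).

    members: list of (Health, in_maintenance) tuples, one per underlying resource.
    maintenance_as: the Health to report for a member in maintenance mode,
                    or None to leave that member out of the rollup.
    """
    states = []
    for health, in_maintenance in members:
        if in_maintenance:
            if maintenance_as is None:
                continue  # member in maintenance does not affect the parent
            states.append(maintenance_as)
        else:
            states.append(health)
    # With no contributing members, the parent is reported as healthy.
    return max(states, default=Health.HEALTHY)


# One resource in maintenance mode, one healthy and monitored.
members = [(Health.HEALTHY, True), (Health.HEALTHY, False)]

as_error = rollup(members, maintenance_as=Health.ERROR)  # the setting I found
ignored = rollup(members, maintenance_as=None)           # the setting I wanted
```

In this model, the first call turns the parent red even though nothing is actually broken, which matches the alert my NOC team received. The second call keeps the parent healthy while the resource is in maintenance.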
And here is the catch:
When you try to save your override, you get this nice red error! So it's impossible to change the behavior of the resource group rollup monitor!
So now, whenever we need to put a cluster node in maintenance mode, we also put the resource groups in maintenance mode.
Have fun,
Alexandre Verkinderen