Views – Or How to Tame The Beast
One of the great things about OpsMgr is the sheer number of management packs and rules available. In a typical install there are over 6,000 rules looking at various aspects of the OS and applications. The problem is that this can create a significant number of alerts. Tuning OpsMgr is the most important piece of work done post installation, and it should be the first (major) part of a continuous tuning process. The way I tame the console is by creating views, quite a lot of views in fact, especially during the first stage of major tuning.
The Operations Console is just a view onto the SQL OperationsManager database. In the Monitoring view you will see all alerts that have not been closed for the last x days, where you can change x. Although this is the main view and the first view that many people go into, it is the worst view to use when dealing with a large number of servers. My preference is to use the My Workspace tab and create specific views. These are very easy to set up, and there are examples of specific views in the folders of the various MPs that can be copied. In fact, if you like a view, right click it to create a shortcut to it in My Workspace.
The main view I use is based on New alerts, so that any other resolution state does not show up. If you are using the console you would tend to put alerts that are being worked on in another resolution state so that they do not show up in New. I create a number of different resolution states so that I can use those to help sort out what needs to be done. I create states like 2nd line support and Rule to be Investigated (whether the rule needs to be disabled or changed). These can all be changed when the system gets handed to operations as a working system.
I keep this view sorted by time created (nearly all columns in all alert views can be sorted by double clicking on the column header). Using Age does not seem as reliable as time created. That way the latest new alerts are always at the top and I can see what has been happening over the last few hours. For example, in the last 8 hours 18 alerts show up in the visible part of this view, but half of those are information or warning, which means only about 1 Critical alert per hour. This gives a view of how the estate is performing, and it is useful to know what the estate looks like when it is “normal” compared to when there are major problems. Anyone who has run a server estate will be familiar with this process, as it is baselining the environment (not to be confused with baseline monitors).
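The arithmetic behind that baseline is simple enough to sketch. The snippet below is only an illustration: the function name is made up, and the 5/4 split between Warning and Information is an assumption chosen to match the "half of 18" figure above.

```python
from collections import Counter

def critical_rate_per_hour(severities, hours):
    """Return (alert counts by severity, Critical alerts per hour)."""
    counts = Counter(severities)
    return counts, counts["Critical"] / hours

# Illustrative sample: 18 alerts in 8 hours, half of them Information or Warning.
# The exact Warning/Information split is an assumption.
sample = ["Critical"] * 9 + ["Warning"] * 5 + ["Information"] * 4
counts, rate = critical_rate_per_hour(sample, hours=8)
print(counts["Critical"], rate)  # 9 1.125
```

Tracking this figure over a few normal days gives a rough number to compare against when the estate is having a bad day.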
Monitors will close alerts when they see that the health is good. Unlike in MOM 2005, these alerts get closed immediately, so you may miss them. One of the key views I have is for auto resolved alerts.
- Go to My Workspace and right click on Favorite Views.
- Choose New and Alert view.
- In Name type Auto Resolved Alerts.
- Tick the box for “Resolved by a specific user”
- Click “specific” and enter SYSTEM. This is not case sensitive.
- You can filter further by time (last 3 days for example), by a group, or by targeting a specific entity.
- Click the Display tab.
- The Resolution State column can be removed, as all alerts in this view are Closed. Arrange the columns to your liking. Hint: in the view itself you can drag the columns around, which is easier. I always remove the Group view as I don’t find it useful. I would add Path (which is the server FQDN), as sometimes the source is cryptic, like C:, which is not much use when you have 500 servers all with a C: drive. I also add the columns for Last Modified and Repeat Count to most views, as they are useful for dealing with rule alerts.
- You now have a view that shows all auto resolved alerts.
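The filter this view applies can be sketched in a few lines. This is a minimal model, not the OpsMgr API: the `Alert` class and its field names are hypothetical stand-ins for the alert columns, and the only product facts assumed are that Closed has resolution state ID 255 and that auto resolved alerts show System as the resolving user.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

CLOSED = 255  # built-in OpsMgr resolution state ID for Closed

@dataclass
class Alert:  # hypothetical stand-in for a row in an alert view
    name: str
    path: str
    resolution_state: int
    resolved_by: Optional[str] = None
    time_resolved: Optional[datetime] = None

def auto_resolved(alerts, now, days=3):
    """Closed alerts resolved by System within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [
        a for a in alerts
        if a.resolution_state == CLOSED
        and a.resolved_by is not None
        and a.resolved_by.lower() == "system"  # the match is not case sensitive
        and a.time_resolved is not None
        and a.time_resolved >= cutoff
    ]

now = datetime(2024, 1, 10, 12, 0)
alerts = [
    Alert("Disk free space low", "srv01.example.local", CLOSED, "System", now - timedelta(hours=2)),
    Alert("Service stopped", "srv02.example.local", CLOSED, "DOMAIN\\operator", now - timedelta(hours=5)),
    Alert("Disk free space low", "srv03.example.local", CLOSED, "SYSTEM", now - timedelta(days=10)),
    Alert("Script error", "srv04.example.local", 0),
]
for a in auto_resolved(alerts, now):
    print(a.path, a.name)  # srv01.example.local Disk free space low
```

Only the first alert survives: the second was closed by a person, the third is outside the three day window, and the fourth is still New.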
Rule based alerts do not know how to resolve themselves. On a regular basis the console should be sorted by Last Modified time. If the repeat count of an alert has not been incremented for a number of days, then you can take the view that the problem is no longer active and clear those old alerts. Also, to keep the console clean, the associated alerts should be resolved when problems are fixed. To stop spurious alerts coming into the console, maintenance mode should be used for servers that are getting rebooted or worked on.
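As a rough sketch of that housekeeping pass, the snippet below picks out open alerts whose Last Modified time is older than a threshold. The dictionary keys and the seven day default are assumptions for illustration; pick whatever "a number of days" means in your estate.

```python
from datetime import datetime, timedelta

CLOSED = 255  # built-in OpsMgr resolution state ID for Closed

def stale_alerts(alerts, now, max_idle_days=7):
    """Open alerts not modified within max_idle_days, oldest first."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = [
        a for a in alerts
        if a["resolution_state"] != CLOSED and a["last_modified"] < cutoff
    ]
    return sorted(stale, key=lambda a: a["last_modified"])

now = datetime(2024, 1, 20, 9, 0)
alerts = [
    {"name": "Script error", "resolution_state": 0, "last_modified": now - timedelta(days=12)},
    {"name": "KCC warning", "resolution_state": 0, "last_modified": now - timedelta(hours=3)},
    {"name": "Event log full", "resolution_state": 0, "last_modified": now - timedelta(days=30)},
    {"name": "Old restart", "resolution_state": CLOSED, "last_modified": now - timedelta(days=40)},
]
print([a["name"] for a in stale_alerts(alerts, now)])
# ['Event log full', 'Script error']
```

The alert modified three hours ago is still incrementing, so it stays; the two that have gone quiet are candidates for clearing.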
Other views I create
- Alerts Green (all information alerts)
- Alerts Amber (all warning alerts)
- Alerts Red (all critical error alerts)
- View for each resolution state that is created
- Alerts based on a source like Clusters or KCC
- Alerts based on a group like AD or Exchange
- Alerts based on a computer name when a computer is being troublesome
- Resolved alerts (not just by system) for the last day and 3 days
- Alerts based on the name of the alert if I am investigating a rule like Script Errors when there are a large number
- Alerts based on name like Windows Server has Restarted for information
- Computer view Servers in Maintenance Mode
- Computer view No Heartbeats for last 15 Mins
- Plus more as needed for events, performance or state.
Note that on some alert views I exclude Closed alerts, but on others (like those based on a computer) I usually leave them in to get a full picture of what is going on. A tip: be careful if you create a view that excludes only Closed alerts. If you just go in and tick every resolution state box apart from Closed, you will not get alerts from a new resolution state that is created afterwards. In this case use the formula at the bottom instead and say less than 255 (the ID of the Closed state), so that it will cover any new resolution state that is not Closed.
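The difference between the two filters is easy to see in a small sketch. The state IDs 0 (New) and 255 (Closed) are the built-in ones; the custom IDs 10 and 20 and the function names are hypothetical.

```python
NEW, CLOSED = 0, 255  # built-in OpsMgr resolution state IDs

def ticked_box_filter(alerts, ticked_states):
    """Mimics ticking individual resolution state boxes in the view."""
    return [a for a in alerts if a["state"] in ticked_states]

def not_closed_filter(alerts):
    """Mimics the 'resolution state less than 255' criterion."""
    return [a for a in alerts if a["state"] < CLOSED]

# States that existed when the view was built: New and a custom state 10.
ticked = {NEW, 10}

alerts = [
    {"name": "Disk space low", "state": NEW},
    {"name": "Script error", "state": 10},
    {"name": "KCC warning", "state": 20},  # custom state created after the view
    {"name": "Old restart", "state": CLOSED},
]

print(len(ticked_box_filter(alerts, ticked)))  # 2
print(len(not_closed_filter(alerts)))          # 3
```

The ticked box version silently drops the alert in the newer state, while the less than 255 version picks it up without the view being edited.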
As views are easy and quick to create, I generally add them as needed and remove them when no longer needed. They help filter down the number of alerts seen in the console to a more manageable view, and creating them is something I always teach people how to do. Divide and conquer.
There are alerts that are constantly repeating, so resolving them does not help as they will be recreated. The choice with these alerts is either to fix the problem or, if they are not going to be acted on, to switch them off for ALL servers. There are also alerts that are fleeting. They happen due to a server being rebooted or a server being temporarily busy. These need to be looked at to see if they are important, but they can be cleared after a few days if they do not increment. Ensuring that alerts are cleared regularly and the underlying problems are fixed will keep the number of alerts in the console down. Creating specific views (especially if using resolution states) will help you focus on what needs to be looked at.
In summary if you create your own views you will find it easy to discover issues and focus on fixing them.