SCOM 2007 Architecture
One of the interesting things about System Center Operations Manager 2007 is that it is not just MOM v3. It is also Corporate Error Reporting (CER) v3, now called Agentless Exception Monitoring (AEM), and Microsoft Audit Collection Services (MACS, or sometimes ACS). The original beta of MACS was called Distributed Audit Database (DAD), so you could have MOM and DAD. The less said about that the better. While MACS was never released as a product, the beta has been deployed by a number of organisations. MACS is not in the beta 2 build of SCOM 2007. So SCOM 2007 is actually three products rolled into one, and all three have different histories and different ways of working. This reminds me of the early versions of Office. It should make the architecture of SCOM interesting though.
MOM was designed to monitor event logs and performance counters and create alerts, so it was designed to filter data rather than collect it. The exception was data collected for reports, which went into the data warehouse. One of the new features of SCOM is that data goes straight to the data warehouse from the SCOM server and does not pass through the SCOM operations database, which means you no longer have that pesky DTS job to take the data out of one database and put it in another. That job was run by that well-known enterprise tool, Scheduled Tasks in the Accessories part of the Windows menu! If the job did not run then grooming in MOM 2005 would not take place. The other advantage is that reports will be more up to date, as they no longer wait for the overnight DTS job.
SCOM requires SQL 2005 as well as a number of other prerequisites:
- Windows Server 2003 SP1
- WinFX runtime (beta 2 at the moment)
- MSXML 6.0 (installed by setup)
- .NET Framework 2.0
- MDAC 2.8
- PowerShell (in beta at the moment)
CER was designed to help organisations collect and analyse Dr Watson errors. CER v2 was available to Software Assurance customers only. Rather than each user sending Dr Watson errors to Microsoft, the errors would be sent to a local server, and a policy could be put in place to make this happen automatically without user intervention, something that Microsoft itself would not be allowed to do. The organisation could still voluntarily send the data on to Microsoft for analysis, which would help Microsoft develop fixes more quickly. In return the organisation would get back information on fixes for known problems. The data is stored on a local file share. This is still the case with AEM, except that the data must be stored on a SCOM server so that it can be analysed and reported on. The problem is that you cannot use DFS to make the file share fault tolerant across multiple SCOM servers, because each SCOM server would analyse each file share. With two SCOM servers it would look like you had twice as many errors, and with three it would look like you had three times as many. Good if you want to show your boss you need more budget to fix the problems, but otherwise a pain. The other option is to create a DFS share on a non-SCOM server, so that if the SCOM server does go down for some reason AEM data is still gathered. At the moment there is no official word on whether DFS would be supported for this scenario, but I cannot see how it would be a problem.
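To make the double counting concrete, here is a toy sketch in Python. It is not how AEM is implemented; the function names, server names and report names are all made up for illustration. It only models the behaviour described above, where every SCOM server independently analyses every report it can see on the replicated share.

```python
from collections import Counter

def analyse_share(reports, servers):
    """Toy model: each server processes every report on the shared folder.

    Hypothetical illustration only, not the real AEM logic.
    """
    times_counted = Counter()
    for _server in servers:
        for report in reports:
            times_counted[report] += 1
    return times_counted

def apparent_error_count(reports, servers):
    """Total number of errors the reports would appear to show."""
    return sum(analyse_share(reports, servers).values())

# Three real crash reports sitting on the replicated share (made-up names).
reports = ["crash-001", "crash-002", "crash-003"]

one_server = apparent_error_count(reports, ["scom1"])
two_servers = apparent_error_count(reports, ["scom1", "scom2"])
three_servers = apparent_error_count(reports, ["scom1", "scom2", "scom3"])
```

With one server the count matches the real number of reports; with two or three servers the same three reports are counted two or three times over.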
Most people will never have come across MACS, although there were presentations at one TechEd that I know of, but it has been around inside Microsoft for a while. There were a number of discussions on how it was going to be released and in what format before they decided to bundle it as part of SCOM 2007. The reason for MACS is to help in the wake of legislation like Sarbanes-Oxley (SOX), which affects more than just US firms. There are also HIPAA, Basel II and others that impose regulatory requirements on some organisations. MACS collects security events (and no other events) from servers and workstations in a secure manner and transports them to a SQL database for storage and analysis. Only in SCOM are workstations seriously considered for MOM-type monitoring, which may also affect the design and scalability of the system. The point of MACS was to have separation from the IT team, who are generally administrators on most systems, but if it is bundled with SCOM then those are the very people that MACS was supposed to be separated from.
MACS is different from MOM in that it was designed to collect events. Also, while a MOM event is usually actionable on its own (e.g. disk almost full, SMTP queue over a certain threshold, service stopped), security events really need to be correlated, and it is the unique information inside the event that is important rather than the event per se. For efficiency MACS does not store the whole event, only the unique parts with pointers. One of the key findings from the beta is that the database can be a bottleneck due to the number of insertions. In which case, would you want that database on the same server as your SCOM database monitoring your key systems? Initially MACS was designed as the collection and storage system, and it was going to be up to third parties and organisations to create front ends and reports. Whether this is still the case now that it is part of SCOM I do not know, but organisations that are used to MOM and its out-of-the-box reports will not be pleased if there are no reports, or at least some samples.
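The idea of storing only the unique parts of an event with pointers can be sketched in a few lines of Python. I do not know the actual MACS schema, so this is only an illustration of the general technique: each distinct string is stored once in a lookup table, and each event row holds small integer references to it. The class name, field names and sample values are all assumptions.

```python
class EventStore:
    """Toy store that keeps each distinct string once and references it by id.

    Illustrative only; the real MACS database layout is not described here.
    """

    def __init__(self):
        self._ids = {}       # string -> integer id
        self._strings = []   # integer id -> string
        self.events = []     # each event is a tuple of integer ids

    def _intern(self, value):
        # Store the string only the first time it is seen.
        if value not in self._ids:
            self._ids[value] = len(self._strings)
            self._strings.append(value)
        return self._ids[value]

    def add(self, event_id, user, computer):
        row = tuple(self._intern(v) for v in (event_id, user, computer))
        self.events.append(row)

    def get(self, index):
        # Rebuild the full event from its pointers.
        return tuple(self._strings[i] for i in self.events[index])

    def unique_strings(self):
        return len(self._strings)


store = EventStore()
store.add("528", "alice", "WS01")   # sample values, made up for the example
store.add("528", "alice", "WS02")
store.add("529", "bob", "WS01")
```

Three events hold nine field values between them, but only six distinct strings are stored. The saving grows as the same users, computers and event IDs repeat, although every insert still costs a lookup, which fits with the database being the bottleneck.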
At this stage there is no guide from Microsoft on best practices for setting up SCOM 2007 with all three components. That type of information will come later. But you have one component (AEM) that stores information on a file share, which then gets analysed and put in a SQL database. SCOM itself stores its data in a SQL database that is designed to be groomed. SCOM also has the data warehouse for long-term storage, and MACS has its own database, which is optimised for security events. And this is before you even take into account the requirements for fault tolerance and failover. Then, if you have SMS and want SCOM, SMS and AD to feed into System Center Reporting Manager, you will need to work out the best way to do that. And that is before sending the information to a service desk or a manager of managers. As I said, designing a SCOM 2007 architecture should be interesting. I just hope that Microsoft publishes some good guidelines and best practices when it is released.