Troubleshooting #SCOM Agent – High CPU Load for System Center Management Service Host Process

Recently we noticed an unusually high CPU load on one Domain Controller. The SCOM Agent was occupying nearly all resources. This short blog post shows how to troubleshoot and fix issues like this.


Task Manager showed that the System Center Management Service Host Process, alternating with MonitoringHost.exe, took nearly all of the CPU on the Domain Controller.

System Center Management Service in TaskMgr
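
The same picture can be taken from a PowerShell prompt on the affected server. This is a minimal sketch, assuming the agent processes are named HealthService and MonitoringHost (an assumption, please verify the names in Task Manager's Details tab); the function name Get-AgentCpuUsage is made up for illustration.

```powershell
# Sketch: list the accumulated CPU time of the SCOM agent processes.
# Assumption: the processes are called HealthService and MonitoringHost.
function Get-AgentCpuUsage {
    param(
        [string[]]$ProcessName = @('HealthService', 'MonitoringHost')
    )

    # CPU is the total processor time in seconds since process start, not a percentage.
    Get-Process -Name $ProcessName -ErrorAction SilentlyContinue |
        Sort-Object -Property CPU -Descending |
        Select-Object -Property Name, Id, CPU
}

Get-AgentCpuUsage
```

Running it twice, a minute apart, shows which process is accumulating CPU seconds the fastest.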


Start up the good ol’ SCOM console. In the Monitoring section, beneath the Operations Manager / Agent Details folder, there is a view called Agents by Version.

There, search for the machine and run the task Show Failed Rules and Monitors for this Health Service.

Console OM AgentsByVersion

A task now collects the information in the background. This can take a few minutes.

In the Task Output window, hundreds of workflow IDs of the type Microsoft.Windows.Server.DNS.2012R2.Monitor.DNSSEC.NameResolutionqueries appeared.

Details about failing rules


After checking the details of the failing Monitor we disabled it. No direct benefit of this particular alert was identified.

Within the Authoring section under Management Pack Objects, Monitors, search for the monitor name and disable it for the class. Remember to save the setting in a dedicated Override Management Pack and not in the default MP!

Disabling unrequired Monitor
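
The same override can also be created from the Operations Manager Shell. The following is only a sketch: it assumes the workflow ID from the task output is the internal name of the monitor, and both the function name Disable-NoisyMonitor and the override MP name in the usage line are hypothetical. The override Management Pack must already exist and be unsealed.

```powershell
# Sketch: disable a monitor for its target class via an override.
# Requires the OperationsManager module and a connection to the management group.
function Disable-NoisyMonitor {
    param(
        [Parameter(Mandatory)][string]$MonitorName,     # internal name of the monitor
        [Parameter(Mandatory)][string]$OverrideMpName   # display name of an existing override MP
    )

    $monitor    = Get-SCOMMonitor -Name $MonitorName
    $overrideMp = Get-SCOMManagementPack -DisplayName $OverrideMpName
    if (-not $monitor -or -not $overrideMp) {
        throw "Monitor '$MonitorName' or management pack '$OverrideMpName' not found."
    }

    # The class the monitor targets; the override applies to all instances of it.
    $targetClass = Get-SCOMClass -Id $monitor.Target.Id

    Disable-SCOMMonitor -Class $targetClass -Monitor $monitor -ManagementPack $overrideMp
}

# Usage (the MP name is a made-up example):
# Disable-NoisyMonitor -MonitorName 'Microsoft.Windows.Server.DNS.2012R2.Monitor.DNSSEC.NameResolutionqueries' -OverrideMpName 'Custom.DNS.Overrides'
```

The block only defines the function, so nothing is changed until it is called explicitly.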

After waiting 5 minutes, re-running Show Failed Rules and Monitors for this Health Service did not show any more errors.

No more failing workflows

The CPU load on that server returned to normal.


To be honest, the steps above are a workaround. As this particular monitor did not bring any value, simply disabling it was the best solution.


Fixing Hybrid – IaaS with Azure Update Management and SCOM

Azure Update Management (AUM) is a free service that helps to deploy patches on servers running in Azure and On Premises (in your datacenter).  It provides basic capabilities, but enough to control the whole patch process.

AUM and OpsMgr

While evaluating AUM on a Windows Server 2019 hosted on Azure I noticed that either monitoring with SCOM or patching via AUM worked, but not both. The MOM agent, which needs to contact both AUM and SCOM, could only contact one destination at a time.

Within the Log Analytics workspace the following error was shown:

“VM has reported a failure when processing extension ‘MicrosoftMonitoringAgent’. Error message: "This machine is already connected to another Log Analytics workspace, or managed by System Center Operations Manager. Please set stopOnMultipleConnections to false in public settings or remove this property, so this machine can connect to new workspaces."”

Required steps to fix in brief

To solve this issue for the VM proceed with the following steps.

  1. Gather this information: Workspace ID, Workspace Key, VM Name, Location and Resource Group Name
  2. Connect to Cloud Shell
  3. Run some PowerShell to set the stopOnMultipleConnections flag to false.
  4. Activate AUM or restart the SCOM agent if the management server was already entered.

Note: The Azure portal is using lots of JavaScript, HTML and other web technologies. I suggest using Microsoft’s Edge browser.


Steps in detail

Search for Log Analytics and click on Virtual Machines to find the problematic VM:

Locating correct Log Analytics Workspace

Choose Advanced Settings

Select Advanced Settings

On Connected sources, note the Workspace ID and the Primary Key (Workspace Key)

Note values for WorkSpaceID and WorkspaceKey ( Primary ID )

Start the Cloud shell and get virtual machine details as mentioned above.

Start Azure Cloud Shell and get VM details

Use a text editor (e.g. Notepad++) and prepare the following code based on the values collected above.


# Public settings: workspace ID plus the flag that allows a second connection
$PublicSettings    = @{ 'workspaceId' = 'c94e5249-e224…'; 'stopOnMultipleConnections' = $false }
# Protected settings: the workspace key (Primary Key)
$ProtectedSettings = @{ 'workspaceKey' = 'FwxRLqbRg9/…' }

Set-AzVMExtension -ResourceGroupName "rsg-wegc-commontest-server" `
    -VMName "vm-WEGCXX0001" `
    -Publisher Microsoft.EnterpriseCloud.Monitoring `
    -ExtensionType MicrosoftMonitoringAgent `
    -TypeHandlerVersion 1.0 `
    -Settings $PublicSettings `
    -ProtectedSettings $ProtectedSettings `
    -Location "West Europe" `
    -Name MicrosoftMonitoringAgent

Copy the code into the clipboard and paste it into the Cloud Shell. Confirm with Return.
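
Before moving on, it can be worth checking that the extension really carries the new flag. The sketch below assumes that Get-AzVMExtension returns the public settings as a JSON string in its PublicSettings property; the helper name Test-MultipleConnectionsAllowed and the sample settings document are made up for illustration.

```powershell
# Pure helper: inspects the extension's public settings (a JSON string).
# The function name is hypothetical.
function Test-MultipleConnectionsAllowed {
    param(
        [Parameter(Mandatory)][string]$PublicSettingsJson
    )

    $settings = $PublicSettingsJson | ConvertFrom-Json
    $flag     = $settings.stopOnMultipleConnections

    # Allowed when the property is absent or set to false.
    ($null -eq $flag) -or ($flag -eq $false)
}

# Against the real VM (requires the Az.Compute module, e.g. in Cloud Shell):
# $ext = Get-AzVMExtension -ResourceGroupName 'rsg-wegc-commontest-server' -VMName 'vm-WEGCXX0001' -Name 'MicrosoftMonitoringAgent'
# $ext.ProvisioningState
# Test-MultipleConnectionsAllowed -PublicSettingsJson $ext.PublicSettings

# Offline demonstration with a made-up settings document:
Test-MultipleConnectionsAllowed -PublicSettingsJson '{"workspaceId":"00000000-0000-0000-0000-000000000000","stopOnMultipleConnections":false}'
```

Keeping the check in a small pure function makes it easy to reuse later, for example in a deployment pipeline.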

Verify that communication with SCOM and AUM works

Start an RDP session, open the Control Panel and launch the Microsoft Monitoring Agent (MOM agent) applet.

Verify MoM Agent - OperationsManager

Verify MoM Agent - Log Analytics

The configuration on the VM looks healthy now.

Within the SCOM Console the server is shown as fully monitored.
Verify VM in Operations Manager Console

Next steps

To ensure that these steps are performed automatically on server creation, it makes sense to add them to an ARM template.

A good starting point is this link by @KasunSJC

Silect Daily Checks Dashboards

First class analysis of your SCOM environment


Silect’s dashboards expose Operation Manager’s health and key performance indicators at a glance.

They help you to easily understand which data is collected, how much space it consumes and which elements are the noisiest.

Various causes of SCOM performance issues can be identified with a single click!

Note: The dashboards only require Power BI Desktop (free!) to gather their information directly from your SCOM environment.


Management Packs (MPs) make SCOM aware of specific workload that should be monitored. MPs exist for Operating Systems, Active Directory, VMWare, Databases and even Azure and Office 365. Software vendors and independent authors offer them either for free or for an affordable price.

By importing a Management Pack, new objects are created in SCOM: classes, which are blueprints of the items that should be monitored; monitors, which represent the health state; rules, which record performance information or warn about misconfiguration; and many more.

While usually the agent on the monitored computer does the checking via the code that comes inside the MPs, the Management Server receives the data and forwards them to its Databases.

The Management Server’s console or the web interface lets you check the information received from agents, offers tools to create your own monitoring settings, displays performance graphs and more. Management Servers also do a bunch of calculations in the background.

Some Management Packs are also designed to run their checks only on the Management Server for various reasons.

Each Management Pack creates additional work for the Management Server.

Thus only MPs for workloads that are important to your environment should be imported.

Furthermore, for those workloads only the rules and monitors that make sense should be enabled. All others should be disabled.


Main section

Maintenance is one important aspect of operating a SCOM environment.
The SCOM administrator takes care that the system itself is healthy, helps that its users are not overwhelmed with alerts and ensures that it responds in a timely manner.

The Dashboards

Ten screens present different aspects of your OpsMgr environment. 

Note: The screenshots have been taken from a customer with 650 Windows Servers and 20 Linux machines. This helps to better present 😊


Agents Overview

Agent Overview

The agent scans the individual server against the classes (blueprints), measures counters, performs the calculations defined in the MP and sends the results to the Management Servers.

  • – Unhealthy Agents, Unknown State Agents (grey color in SCOM)
  • – Pending Management Agents
  • – Agents without Failover
  • – Total Agents number

-> Agents in unknown state and unhealthy ones cannot deliver data so need to be fixed!
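
For a quick look without the dashboard, the agent states can also be counted in the Operations Manager Shell. This is a minimal sketch: the helper Get-AgentHealthSummary and the sample objects are made up, and the commented lines assume the OperationsManager module is loaded and connected.

```powershell
# Sketch: count objects per HealthState (works on any objects with that property).
function Get-AgentHealthSummary {
    param(
        [Parameter(Mandatory)][object[]]$Agent
    )

    $Agent |
        Group-Object -Property HealthState |
        Sort-Object -Property Count -Descending |
        Select-Object -Property @{ Name = 'HealthState'; Expression = { $_.Name } }, Count
}

# With the OperationsManager module loaded:
# Get-AgentHealthSummary -Agent (Get-SCOMAgent)
# Get-SCOMAgent | Where-Object { $_.HealthState -ne 'Success' } | Select-Object DisplayName, HealthState
# Get-SCOMPendingManagement

# Offline demonstration with made-up sample data:
$sample = @(
    [pscustomobject]@{ DisplayName = 'srv01'; HealthState = 'Success' }
    [pscustomobject]@{ DisplayName = 'srv02'; HealthState = 'Success' }
    [pscustomobject]@{ DisplayName = 'srv03'; HealthState = 'Uninitialized' }
    [pscustomobject]@{ DisplayName = 'srv04'; HealthState = 'Error' }
)
Get-AgentHealthSummary -Agent $sample
```

Agents reported as Uninitialized correspond to the grey, unknown state in the console.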



OperationsManager DB

SCOM’s working database stores short-term information and configuration. There should always be enough free space available.
Grooming takes care that old data is deleted, and that data is aggregated and moved to the correct tables for long-term storage.

  • – Data- and Log file size, free space and space unused
  • – Grooming tasks

-> Estimate the size for the next few months and set the size accordingly instead of relying on AutoGrow!
-> Occasional grooming failures can be ignored. Persistent ones must be fixed to avoid endless DB growth and performance issues.
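
The size estimate is simple arithmetic: current size plus daily growth times the number of days, plus some headroom. A minimal sketch follows; the function name, the 20 percent default headroom and the example numbers are assumptions for illustration, not values taken from the dashboards.

```powershell
# Sketch: project the database size so the files can be pre-sized instead of auto-grown.
function Get-ProjectedDatabaseSizeGB {
    param(
        [Parameter(Mandatory)][double]$CurrentSizeGB,
        [Parameter(Mandatory)][double]$DailyGrowthGB,
        [int]$Days = 180,
        [double]$HeadroomPercent = 20   # arbitrary safety margin, adjust to taste
    )

    $projected = $CurrentSizeGB + ($DailyGrowthGB * $Days)
    [math]::Round($projected * (1 + $HeadroomPercent / 100), 1)
}

# 50 GB today, growing 0.5 GB per day, looking 180 days ahead:
Get-ProjectedDatabaseSizeGB -CurrentSizeGB 50 -DailyGrowthGB 0.5 -Days 180
# → 168
```

The daily growth can be read off the free-space trend shown on this screen.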



SCOM’s database for long-term information, used especially for reports. There should always be enough free space available.

  • Data- and Log file size, free space and space unused


OM DB Backup


SCOM’s database backup state. In case of a disaster the database backup is required to get the system back. It should be performed daily.

  • – Time Taken and Backup Size are shown in relation to each other.
  • – The slicer and the drop-down menu allow the specification of the date range.


OM DW DB Backup

Same as the previous screen, but for the data warehouse database.


Config Churn


Objects that represent the items which are monitored by SCOM are made of classes which are defined in the MP. Classes have attributes (properties) that will store values. This screen is dedicated to the property changes.

  • – Changes per Class
  • – Changes per Management Pack
  • – Changes on timeline
  • – Changes by Object Name
  • – Changes by Property System Name

-> A high number of changes has a dramatic impact on SCOM performance and shall be avoided.
-> In this case the DNS Zone class, more specifically its PrimaryServerName attribute, is updated too frequently. => As it is not required in this environment, the DNS zone discovery has been disabled to improve the performance.


Events in Operations Manager DB


Events are written, mostly for informational purposes without the direct need of action. – In other cases, they are similar to debug messages that developers write to track their program code behavior in detail.

  • – Events per Event ID – % Total
  • – Events per Event ID
  • – Events per Logging Computer
  • – A table containing details of the Event for overview

-> Events shall be checked from time to time. If no value can be found, the corresponding rules shall be disabled. Less information written to the DB results in better performance!


Monitor State Changes – last 24 hours


Monitors determine the health state of an object. They are in either a green (healthy), yellow (warning) or red (critical) state. A state change is the event in which the monitor changes an object from one color to another.

  • – Monitor State Changes per Server
  • – Monitor State Changes per Monitor Name
  • – Monitor State Changes per Server – % Total
  • – Monitor State Changes per Monitor Name – %Total

-> As most monitors automatically switch back to healthy once the good condition is detected, some problems can remain unnoticed. That need not be a problem. Checking this screen can help to discover such issues and to take measures before they become a problem.
-> Another reason for too frequent state changes is default thresholds that do not fit your requirements. An override to the right level fixes the problem permanently.


OM DB and OM DW Tables and Datasets


As SCOM stores all its information in databases, those grow over time. This screen gives you an idea what kind of data is stored in the DB and how much it is.

  • – TotalSpace by Tablename (Database)
  • – TotalSpace by Tablename (Datawarehouse Database)
  • – OperationsManager Tables (Overview table)

-> The first part of controlling the data storage requirements is changing the retention of the different information types in the Operations Manager console (Administration > Settings > Database Grooming).
-> The second part is changing the data warehouse grooming settings via a tool named DWDATARP.

=> For both: Store the data only as long as required.
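
The retention of the OperationsManager database can also be read and changed from the Operations Manager Shell. The sketch below is only an illustration: the wrapper name Set-ShortRetention and the 5-day values are made up, and the data warehouse retention is not touched by it.

```powershell
# Sketch: show and adjust the OperationsManager DB grooming settings.
# Requires the OperationsManager module and a connection to the management group.
function Set-ShortRetention {
    param(
        [byte]$PerformanceDays = 5,   # example value, not a recommendation
        [byte]$EventDays       = 5    # example value, not a recommendation
    )

    # Current settings, for the record
    Get-SCOMDatabaseGroomingSetting

    Set-SCOMDatabaseGroomingSetting -PerformanceDataDaysToKeep $PerformanceDays -EventDaysToKeep $EventDays
}

# Usage:
# Set-ShortRetention -PerformanceDays 5 -EventDays 5
```

The block only defines the function; nothing is changed until it is called.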


Performance Collection – Last 24 hours


Collecting performance data over time helps to understand an application’s behavior, allows predictions for capacity management and helps to spot bugs like memory leaks.
In SCOM, rules do the job of collecting those metrics. They are stored in the Operations Manager database for the short term and in the data warehouse DB for the long term.

This screen gives insight into how many samples are collected by which rule.

  • – Filter for Counter Name, Object Name or Rule Name
  • – TotalSamples by ObjectName
  • – TotalSamples by CounterName

-> To not waste disk space and keep the database performing well, disable rules that collect data you don’t need.



  • – Refresh Date
    • – – When the last data has been queried from the SCOM environment
  • – Publish Date
    • – – When these dashboards were released for testing.



Silect’s Dashboards significantly improve regular checks. They cover all important aspects that need to be taken care of and present them in an easy-to-consume way.

Most importantly, they let you handle the job quickly and with fun 😊

Without the dashboards, manual checking is required: using SQL Management Studio to check the database, running reports to identify objects with frequent state changes or using the SCOM console to get an overview of agents. Some information needs SQL commands that are not easy to write.

These dashboards are a must have for every committed SCOM admin.
Great product!

Further reading

Prepared by Ruben, 2020-02-11

System Center Operations Manager (SCOM): Monitoring Citrix XenApp XenDesktop 7.18 – By Stoyan Chalakov

If you’ve tried or are trying to monitor the new Citrix XenApp and XenDesktop 7.18 product version with the Citrix SCOM Management Pack for XenApp and XenDesktop, you are familiar with its challenges. Our good friend Stoyan faced a similar challenge recently, and to give everyone out there a helping hand, he noted his solution down for everyone to refer to! Here’s how Stoyan solved the challenge:


All of you, who are using this management pack know that the latest version available at the time of writing (v3.14) supports max. Citrix XenApp and XenDesktop 7.16. The thing is that Citrix released their most recent version of XenApp and XenDesktop (7.18) a couple of months ago.

So if you have done an upgrade to this version without coordinating this with your Operations Team first, then most certainly you are in a situation where your Citrix XAXD environment is not monitored by SCOM. So, how to solve this? Before answering this, let us first briefly describe how the management pack discovers the XAXD Delivery Controller.

In order for the management pack to identify a particular server as Citrix XAXD Delivery Controller and create an instance of this class (ComTrade.Citrix.XenDesktop.DeliveryController.ComputerRole.Discovery ) in SCOM, the management pack does a registry based discovery by checking the value of the following registry key on your delivery controller:

SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Citrix Desktop Delivery Controller\VersionMajor

If you have done an upgrade and you are in this situation, then the value of the “VersionMajor” key should be either “1808” or “1903”.

I personally contacted Citrix many times regarding this topic and asked about the progress on the development of the new management pack version. Unfortunately, I was told that it is not certain when it will be released, and nobody could give an estimate. Unable to find detailed information on the discovery, I gave up looking for a solution.



Shortly after the last contact with Citrix, while answering SCOM related questions on the Microsoft Social Technet Forums, I came across a thread describing the exact same problem and introducing a possible workaround:

Citrix XAXD management pack

It seems that the registry discovery run by the management pack uses a GreaterEqual expression to check the value of the VersionMajor registry key and, if the value does not match the predefined values for the previous XAXD versions, the discovery discards the results. That being said, the workaround is to edit the value of the key and enter a value corresponding to the older versions, which in turn should allow the discovery to run fine, identify the managed system as XAXD Delivery Controller and create an instance of the ComTrade.Citrix.XenDesktop.DeliveryController.ComputerRole.Discovery class.

According to the post the value, which can be used to trigger the discovery of the delivery controller is “7”.


Test results and important notes

The user who presented this in the forum (huge thanks for this) confirmed that the workaround works just fine for the “1903” version, and I can confirm that it also works with the “1808” version. There were no side effects of changing the value of the key on the delivery controller and the XAXD environment in general. Still, please be aware of the following important points:

  • First of all, this is only a workaround, not a real solution to the problem. Considering this, you need to be very careful with changing the value of the registry key, and remember that you do this at your own risk.
  • Always back up your registry first before making any changes. This is a very important rule and it fully applies in this case as well.
  • Make sure you revert the value of the VersionMajor registry key before doing further upgrades or uninstalling XenApp and XenDesktop.
  • Make sure you also revert the key back to its original value before upgrading the Citrix SCOM Management Pack for XenApp and XenDesktop, in case a new version is released.
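
The backup and the read-out of the current value can be scripted. What follows is only a sketch under assumptions: the post names the key path without a hive, so HKEY_LOCAL_MACHINE is assumed here, and the function names and the backup file path are made up. The change itself is left commented out on purpose.

```powershell
# Assumption: the Uninstall key lives under HKEY_LOCAL_MACHINE.
$citrixKey = 'SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Citrix Desktop Delivery Controller'

# Reads the current VersionMajor value; note it down so it can be reverted later.
function Get-CitrixVersionMajor {
    (Get-ItemProperty -Path "HKLM:\$citrixKey" -Name 'VersionMajor' -ErrorAction Stop).VersionMajor
}

# Exports the whole key to a .reg file before anything is changed.
function Backup-CitrixUninstallKey {
    param(
        [Parameter(Mandatory)][string]$BackupFile
    )

    & reg.exe export "HKLM\$citrixKey" $BackupFile /y
}

# Usage on the Delivery Controller (elevated prompt, file path is an example):
# Backup-CitrixUninstallKey -BackupFile 'C:\Temp\CitrixDDC-Uninstall.reg'
# Get-CitrixVersionMajor
# Check the existing value type first, then apply the workaround value from the forum post:
# Set-ItemProperty -Path "HKLM:\$citrixKey" -Name 'VersionMajor' -Value 7
```

Importing the exported .reg file restores the original value before an upgrade or uninstall.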



The Microsoft System Center Operations Manager Community is very comprehensive and is an origin of many helpful ideas and solutions. The Microsoft SCOM Social Technet Forum in particular is a place, where you can seek help for technical problems, but is also a source of wide range of information on important and interesting topics. This particular one is the best example for this. Hopefully it will help you out in getting your Citrix XAXD 7.18 environment monitored again!




Most Awesome Tools For IT Administrators

Here is a collection of awesome tools for IT administrators

Contributions are always welcome! We accept proprietary and commercial software too.

Thanks all! You’re awesome and this wouldn’t be possible without you! The goal is to build a categorized community-driven collection of very well-known resources.

SCOM Troubleshooting: SQL Monitoring failed – Error after installing SCOM MP 7.0.15 for MS SQL – By Ruben Zimmermann

In this troubleshooting tip, Ruben talks about fixing the error you might encounter after you install the Management Pack for SQL version 7.0.15.


After we updated the MS SQL Management Pack to 7.0.15 several servers threw alerts that monitoring isn’t working any more.

The error message stated that a WMI query did not return any value.


Shortly after we updated the Management Pack to 7.0.15 we received error messages from a couple of SQL Servers.

The WMI query that is triggered by the DiscoverSQL2012DBEngineDiscovery.vbs did not return any valid instances.

Furthermore, on the SQL Server the SQL Server Configuration Manager could not be started anymore and terminated with the error that it could not connect to the WMI  provider.


Rebuild the WMI provider on the affected SQL Servers:

  1. Start the command prompt (cmd)
  2. Navigate to (cd) “C:\Program Files (x86)\Microsoft SQL Server\110\Shared\”, where 110 depends on the SQL Server version installed on that machine
  3. Run the following command: mofcomp sqlmgmproviderxpsp2up.mof
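
If several servers are affected, the three steps can be wrapped in a small PowerShell function. This is a sketch under the assumptions already stated above (default installation path, version folder such as 110); the function names are made up for illustration.

```powershell
# Builds the path to the provider MOF file for a given version folder (e.g. 110).
function Get-SqlProviderMofPath {
    param(
        [Parameter(Mandatory)][string]$VersionFolder
    )

    "C:\Program Files (x86)\Microsoft SQL Server\$VersionFolder\Shared\sqlmgmproviderxpsp2up.mof"
}

# Recompiles the MOF file with mofcomp; run from an elevated prompt on the SQL Server.
function Repair-SqlWmiProvider {
    param(
        [Parameter(Mandatory)][string]$VersionFolder
    )

    $mof = Get-SqlProviderMofPath -VersionFolder $VersionFolder
    if (-not (Test-Path -LiteralPath $mof)) {
        throw "MOF file not found: $mof"
    }

    & mofcomp.exe $mof
}

# Usage:
# Repair-SqlWmiProvider -VersionFolder '110'
```

The existence check prevents mofcomp from being started with a path that does not match the installed SQL Server version.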

Hope this helps!


SCOM Troubleshooting – Missing Reports

Recently I was deploying a new SCOM 2019 management group. I had Kevin’s SCOM 2019 deployment guide on the side for reference and to organize a general flow of the process. Everything was going great, until I came to the part of installing the reporting server. Now, I was following the guide closely and had all the best practices for permissions for all my SCOM accounts. But, still I was having a hard time installing reporting.

I installed SSRS, validated the URL, all was good. Then when I was running the reporting server setup, it was taking a long time and would finally fail. Now this was just for a POC and I figured it was better to just start this piece again instead of spending time on troubleshooting, looking into logs and stuff. So, that’s what I did. Cleaned up everything, uninstalled SSRS and ran the installer again. Well, this time again it took a lot of time, but was actually successful. But when I went into the reporting tab to validate the installation, I saw this:

Hmm…not good. I see a bunch of reports missing here. So I started looking into the logs. And indeed, this is what I found:

Type: Error
Event ID: 26319
User: N/A
Computer: Computername
Description: An exception was thrown while processing GetUserRolesForOperationAndUser for session id uuid:UUID.
Exception Message: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
Full Exception: System.UnauthorizedAccessException: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
   at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromStringSid(String SidString, Int32 lOptions, Object varReserved)
   at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.GetScopedRoleAssignmentsForUser(IList`1 roleNames, String userName)
   at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AuthManager.GetUserRolesForOperationAndUser(Guid operationId, String userName)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetUserRolesForOperationAndUser(Guid operationId, String userName)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetUserRolesForOperationAndUser(Guid operationId, String userName)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetUserRolesForOperationAndUser(Guid operationId, String userName)

No worries though, a quick search brought me this:

When you try to install Microsoft System Center Operations Manager 2007 Reporting, the installation is unsuccessful

Now this link talks about issues with OS Windows 2000, but I was using Windows 2016, so I should not be technically facing this issue. However, since the accounts I was using were local accounts and not domain admins I still decided to try it out and add the SDK (DAS account in SCOM 2019) account in the Windows Authorization Access Group, which actually seemed to work. Slowly but steadily, all the reports started appearing. Problem solved! 😉

Hope this helps!


Automate SQL Express Instance removal from SCOM with PowerShell – Ruben Zimmermann

Ruben is back again with another Powershell banger! This time he presents you a Powershell script that will automatically detect and remove the SQL Express Instances from SCOM monitoring and save you from unnecessary overhead of removing them manually!


SQL Express databases are widely used by applications to store settings or small amounts of data. Apart from backups, in most cases it is not required to manage those databases.

The MS SQL Server Management Pack for SCOM discovers any edition. Thus, we can spot Express databases from the SCOM Console.

Unfortunately, the Management Pack can’t monitor Express databases and lots of unfixable alerts are thrown.

Manual solution

As described by Tim McFadden here

or by  Kevin Holman here

It is possible to set filter strings to prevent the discovery of all Express instances by name.

This does not work if the Express is named as MSSQLSERVER.

MSSQLSERVER is also the default for SQL Standard and other editions.

Is the only choice then to override object by object manually?

PowerShell solution

With a bit of PowerShell it is possible to override the discovery rules for Express editions no matter which name they have. – Put this script into your regular maintenance scripts to keep your SCOM free from Express instances:

# Defining Override Management Pack. - It needs to be created before starting.
$overrideMP = Get-SCOMManagementPack -DisplayName 'Custom.SQL.Server.Express.Removals'

# Get all Windows Servers Instances, needed as lookup to disable the specific computer object
$winSrvClass     = Get-SCOMClass -Name Microsoft.Windows.Server.Computer
$winSrvInstances = Get-SCOMClassInstance -Class $winSrvClass

# Get Express instances For SQL 2005 to SQL 2012
$classSQL2to8     = Get-SCOMClass -Name Microsoft.SQLServer.DBEngine
$instancesSQL2to8 = Get-SCOMClassInstance -Class $classSQL2to8
$expressSQL2to8   = $instancesSQL2to8 | Where-Object { $_.'[Microsoft.SQLServer.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2to8 = $expressSQL2to8.Path

# Finding the computer objects which host SQL 2005 to SQL 2012, store them in ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2to8 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # Assumed reconstruction: this line was lost in the original listing
            $null = $targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2to8 | ForEach-Object { }

# Disabling the Discovery Rules for those computers which host SQL 2005 to SQL 2012
$discoveryRuleList = @(
    'Discover SQL Server 2005 Database Engines (Windows Server)'
    'Discover SQL Server 2008 Database Engines (Windows Server)'
    'Discover SQL Server 2012 Database Engines (Windows Server)'
)

foreach ($discoveryRuleItem in $discoveryRuleList) {
    $discoveryRule = Get-SCOMDiscovery -DisplayName $discoveryRuleItem
    $targetComputers | ForEach-Object {
        Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
    }
}

#Removing the objects from SCOM. - This can take some Time!
# Assumed reconstruction: the call was lost in the original listing
Remove-SCOMDisabledClassInstance


# Get Express instances for SQL 2014
$classSQL2014     = Get-SCOMClass -Name 'Microsoft.SQLServer.2014.DBEngine'
$instancesSQL2014 = Get-SCOMClassInstance -Class $classSQL2014
$expressSQL2014   = $instancesSQL2014 | Where-Object { $_.'[Microsoft.SQLServer.2014.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2014 = $expressSQL2014.Path

# Finding the computer objects which host SQL 2014, store them in ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2014 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # Assumed reconstruction: this line was lost in the original listing
            $null = $targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2014 | ForEach-Object { }

# Disabling the Discovery Rule for those computers which host SQL 2014
$discoveryRule = Get-SCOMDiscovery -DisplayName 'MSSQL 2014: Discover SQL Server 2014 Database Engines'
$targetComputers | ForEach-Object {
    Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
}

#Removing the objects from SCOM. - This can take some Time!
# Assumed reconstruction: the call was lost in the original listing
Remove-SCOMDisabledClassInstance


# Get Express instances for SQL 2016
$classSQL2016     = Get-SCOMClass -Name 'Microsoft.SQLServer.2016.DBEngine'
$instancesSQL2016 = Get-SCOMClassInstance -Class $classSQL2016
$expressSQL2016   = $instancesSQL2016 | Where-Object { $_.'[Microsoft.SQLServer.2016.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2016 = $expressSQL2016.Path

# Finding the computer objects which host SQL 2016, store them in ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2016 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # Assumed reconstruction: this line was lost in the original listing
            $null = $targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2016 | ForEach-Object { }

# Disabling the Discovery Rule for those computers which host SQL 2016
$discoveryRule = Get-SCOMDiscovery -DisplayName 'MSSQL 2016: Discover SQL Server 2016 Database Engines'
$targetComputers | ForEach-Object {
    Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
}

#Removing the objects from SCOM. - This can take some Time!
# Assumed reconstruction: the call was lost in the original listing
Remove-SCOMDisabledClassInstance


Feedback is appreciated 😊

Warm regards


Hybrid Monitoring Solutions during your transition to Cloud

Most enterprises now have either moved to cloud, or are moving towards it. And why not? Running your workloads on cloud services such as Azure frees you up from a lot of maintenance and administrative overheads, and you can use this time to do something better.

Here are some major benefits to moving to cloud:

  1. Less administration tasks – The cloud providers are responsible for managing and upgrading their infrastructure and so the customer does not have to worry about that.
  2. Cloud is flexible – It can adjust to rapid growth or fluctuations in business and adapt to them to provide you with optimized resources, which helps manage costs.
  3. Cost efficient – Since you don’t have to spend on the big hardware and the maintenance that comes along with it, you can save that initial capital investment. Moreover, on cloud you mostly only pay for what you use and for the time you use it, it saves a lot of cost there as well.
  4. Disaster Recovery – Not every company, especially smaller sized, can invest into a Disaster Recovery strategy. On premise, it’s basically like running two datacenters and so double the cost. Moving to cloud eliminates that since the cloud provider is responsible to provide resiliency on their side to make sure your servers are up and running even if there is any hardware failure.

These are just some of the major benefits that transitioning to cloud provides; there are many more. So if you’ve made a decision to move to the cloud, you’re looking in the right direction!

Now, on premise or on cloud – monitoring your infrastructure is equally critical. While the cloud provider will look after the hardware components, monitoring your servers and applications is still your responsibility, and something that you need to invest the time, money and efforts into. There are some great tools out there in the market to let you effectively monitor your infrastructure, like Microsoft’s System Center Operations Manager (SCOM), and Azure Monitor, which is a monitoring solution residing in Azure. So which one should you use to monitor your infrastructure?

Since you are transitioning to cloud (say Azure), you already have an on premise infrastructure. That most likely means you also already have made an investment in a tool like SCOM for monitoring it. So now you’re wondering, “So…does moving to Azure mean I have to decommission my SCOM now and move my monitoring to Azure Monitor?”

The good news is that you don’t have to choose between SCOM and Azure Monitor at all! They work best together in a hybrid environment and complement each other very well.

SCOM is generally considered better at monitoring on premise workloads and has been used for it for a very long time. SCOM provides deep insights and very thorough monitoring of the workloads you want to monitor. It is also very easy to monitor your custom applications by authoring your own management packs. In short, it gives you a more detailed look into your infrastructure and alerts you based on it.

Azure Monitor on the other hand suits the best for Azure resources. Since it does not require installation, it is up and running in a matter of minutes. It also does not require you to worry about maintaining it, upgrading it or troubleshooting it. It is highly scalable, which means you can start on-boarding your servers immediately without worrying about the underlying infrastructure sizing capacity. However the biggest highlight of using Azure Monitor is probably its ability to query the data. Once the agent collects the data you can query it and get very granular. It is a very efficient way to make sure you’re only dealing with the data you want, and are only alerted for what you’re concerned with.

SCOM integrates seamlessly with Azure Monitor and can upload all the data it is collecting on premise to Azure Monitor where it can be queried. There are some great advantages of integrating SCOM with Azure Monitor, for example:

  1. You’re now getting more useful data rather than spam. Azure Monitor’s querying capability plays an important role here. You collect the data from Azure resources as well as on premise servers, and extract only the data you need for alerting that is meaningful to you.
  2. Azure Monitor provides a single pane of glass for alerts and ways to manage them across your infrastructure, so it reduces the administrative overhead considerably.
  3. With all the data going into Azure Monitor, you can actually shut off a lot of workloads you don’t need from SCOM which means better performance with less resources used!
  4. SCOM monitors what it monitors best, on premise infrastructure, while Azure Monitor monitors what it monitors best, cloud resources.
  5. You can reduce the dependency on only one monitoring solution, and run these two in parallel for resiliency.
  6. You can leverage PowerBI to visualize the data
  7. With release of SCOM 2019, with all its new capabilities and better visibility into cloud resources, this integration has become even better!
  8. It is much more cost-effective considering the returns it provides in value in a long run.

Hope this helps you plan your transition to cloud while maintaining monitoring of it all!

(Featured image credits to Microsoft!)


SCOM 2019 vs Azure Monitor: Which one to choose?

Having worked with both SCOM and Azure Monitor, recently I was asked to compare them both and suggest the right choice. First off, I have a disclaimer to make – Azure Monitor is great, but it can not replace SCOM entirely, not just yet.

SCOM 2019 was recently released and it came loaded with some great new features. Read more about it here. What I especially like are its new capabilities for monitoring Azure resources. It now has more insight into the cloud than ever before. And with the ever rising numbers of cloud migrations or new cloud deployments, Azure Monitor’s popularity and importance keep growing.

However, I believe these two tools have their own “personality” if you will, and work best with each other. Here’s what I have to say about this in more detail:

Defining Your Enterprise Monitoring Strategy: Close the Gaps with SCOM 2019 and Azure Monitor