Tag: SCOM

SCOM Troubleshooting: SQL Monitoring failed – Error after installing SCOM MP 7.0.15 for MS SQL – By Ruben Zimmermann

In this troubleshooting tip, Ruben talks about fixing the error you might encounter after you install the Management Pack for SQL version 7.0.15.

Introduction

After we updated the MS SQL Management Pack to 7.0.15, several servers raised alerts that monitoring was no longer working.

The error message stated that a WMI query did not return any value.

Details

Shortly after we updated the Management Pack to 7.0.15 we received error messages from a couple of SQL Servers.

The WMI query that is triggered by DiscoverSQL2012DBEngineDiscovery.vbs did not return any valid instances.

Furthermore, SQL Server Configuration Manager could no longer be started on the affected SQL Servers. It terminated with the error that it could not connect to the WMI provider.

Solution

Rebuild the WMI provider on the affected SQL Servers:

  1. Start the command prompt (cmd)
  2. Navigate to (cd) “C:\Program Files (x86)\Microsoft SQL Server\110\Shared\”, where 110 depends on the SQL Server version installed on that machine
  3. Run the following command: mofcomp sqlmgmproviderxpsp2up.mof
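
The same steps can be scripted in PowerShell. This is only a sketch: the function name Get-SqlProviderMofPath is hypothetical, and it assumes a default SQL Server installation under Program Files (x86). The version folder (110 in the example above) has to match what is installed on the machine.

```powershell
# Hypothetical helper: builds the expected path of the WMI provider MOF file.
# Assumes the default installation folder under "Program Files (x86)".
function Get-SqlProviderMofPath {
    param(
        [Parameter(Mandatory)][string]$VersionFolder,          # e.g. '110'
        [string]$ProgramFilesX86 = ${env:ProgramFiles(x86)}
    )
    "$ProgramFilesX86\Microsoft SQL Server\$VersionFolder\Shared\sqlmgmproviderxpsp2up.mof"
}

# Usage on an affected SQL Server, from an elevated PowerShell session:
# $mof = Get-SqlProviderMofPath -VersionFolder '110'
# if (Test-Path -LiteralPath $mof) { mofcomp.exe $mof } else { Write-Warning "Not found: $mof" }
```

Checking the path first avoids running mofcomp against a version folder that does not exist on that server.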

Hope this helps!

Cheers

SCOM Troubleshooting – Missing Reports

Recently I was deploying a new SCOM 2019 management group. I had Kevin’s SCOM 2019 deployment guide on the side for reference and to organize a general flow of the process. Everything was going great, until I came to the part of installing the reporting server. Now, I was following the guide closely and had all the best practices for permissions for all my SCOM accounts. But, still I was having a hard time installing reporting.

I installed SSRS, validated the URL, and all was good. Then, when I ran the reporting server setup, it took a long time and finally failed. Now, this was just a POC, and I figured it was better to start this piece again instead of spending time on troubleshooting and digging through logs. So that’s what I did: cleaned up everything, uninstalled SSRS and ran the installer again. This time it again took a long time, but it was actually successful. But when I went into the Reporting tab to validate the installation, I saw this:

Hmm…not good. I see a bunch of reports missing here. So I started looking into the logs. And indeed, this is what I found:

Type: Error

Event ID: 26319 

User: N/A

Computer: Computername
Description: An exception was thrown while processing GetUserRolesForOperationAndUser for session id uuid:UUID. Exception Message: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)) Full Exception: System.UnauthorizedAccessException: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromStringSid(String SidString, Int32 lOptions, Object varReserved)
at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.GetScopedRoleAssignmentsForUser(IList`1 roleNames, String userName)
at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AuthManager.GetUserRolesForOperationAndUser(Guid operationId, String userName)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetUserRolesForOperationAndUser(Guid operationId, String userName)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetUserRolesForOperationAndUser(Guid operationId, String userName)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetUserRolesForOperationAndUser(Guid operationId, String userName)

No worries though, a quick search brought me this:

When you try to install Microsoft System Center Operations Manager 2007 Reporting, the installation is unsuccessful

Now, this link talks about issues on Windows 2000, and I was using Windows 2016, so technically I should not have been facing this issue. However, since the accounts I was using were local accounts and not domain admins, I decided to try it anyway and added the SDK account (the DAS account in SCOM 2019) to the Windows Authorization Access Group, which actually seemed to work. Slowly but steadily, all the reports started appearing. Problem solved! 😉

Hope this helps!

Cheers

Automate SQL Express Instance removal from SCOM with PowerShell – Ruben Zimmermann

Ruben is back again with another Powershell banger! This time he presents you a Powershell script that will automatically detect and remove the SQL Express Instances from SCOM monitoring and save you from unnecessary overhead of removing them manually!

Introduction

SQL Express databases are widely used to store application settings or small amounts of data. Apart from backups, in most cases these databases do not need to be managed.

The MS SQL Server Management Pack for SCOM discovers any edition. Thus, we can spot Express databases from the SCOM Console.

Unfortunately, the Management Pack can’t monitor Express databases and lots of unfixable alerts are thrown.

Manual solution

As described by Tim McFadden here:

https://www.scom2k7.com/disabling-sql-express-instance-discoveries/

or by Kevin Holman here:

https://kevinholman.com/2010/02/13/stop-monitoring-sql-express-and-windows-internal-database/

it is possible to set filter strings that prevent the discovery of all Express instances by name.

This does not work if the Express instance is named MSSQLSERVER, because MSSQLSERVER is also the default instance name for SQL Standard and other editions.

The only choice then is to override object by object manually. Or is it?

PowerShell solution

With a bit of PowerShell it is possible to override the discovery rules for Express editions no matter what the instances are named. Put this script into your regular maintenance scripts to keep your SCOM free of Express instances:

# Define the override Management Pack. It needs to be created before starting.
$overrideMP = Get-SCOMManagementPack -DisplayName 'Custom.SQL.Server.Express.Removals'

# Get all Windows Server instances, needed as a lookup to disable the specific computer object
$winSrvClass     = Get-SCOMClass -Name Microsoft.Windows.Server.Computer
$winSrvInstances = Get-SCOMClassInstance -Class $winSrvClass

# Get Express instances for SQL 2005 to SQL 2012
$classSQL2to8     = Get-SCOMClass -Name Microsoft.SQLServer.DBEngine
$instancesSQL2to8 = Get-SCOMClassInstance -Class $classSQL2to8
$expressSQL2to8   = $instancesSQL2to8 | Where-Object { $_.'[Microsoft.SQLServer.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2to8 = $expressSQL2to8.Path

# Find the computer objects that host SQL 2005 to SQL 2012 and store them in an ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2to8 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # [void] suppresses the index that ArrayList.Add() returns
            [void]$targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2to8 | ForEach-Object { }

# Disable the discovery rules for those computers that host SQL 2005 to SQL 2012
$discoveryRuleList = @(
    'Discover SQL Server 2005 Database Engines (Windows Server)'
    'Discover SQL Server 2008 Database Engines (Windows Server)'
    'Discover SQL Server 2012 Database Engines (Windows Server)'
)
foreach ($discoveryRuleItem in $discoveryRuleList) {
    $discoveryRule = Get-SCOMDiscovery -DisplayName $discoveryRuleItem
    $targetComputers | ForEach-Object {
        Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
    }
}

# Remove the objects from SCOM. This can take some time!
Remove-SCOMDisabledClassInstance

# Get Express instances for SQL 2014
$classSQL2014     = Get-SCOMClass -Name 'Microsoft.SQLServer.2014.DBEngine'
$instancesSQL2014 = Get-SCOMClassInstance -Class $classSQL2014
$expressSQL2014   = $instancesSQL2014 | Where-Object { $_.'[Microsoft.SQLServer.2014.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2014 = $expressSQL2014.Path

# Find the computer objects that host SQL 2014 and store them in an ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2014 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # [void] suppresses the index that ArrayList.Add() returns
            [void]$targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2014 | ForEach-Object { }

# Disable the discovery rule for those computers that host SQL 2014
$discoveryRule = Get-SCOMDiscovery -DisplayName 'MSSQL 2014: Discover SQL Server 2014 Database Engines'
$targetComputers | ForEach-Object {
    Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
}

# Remove the objects from SCOM. This can take some time!
Remove-SCOMDisabledClassInstance

# Get Express instances for SQL 2016
$classSQL2016     = Get-SCOMClass -Name 'Microsoft.SQLServer.2016.DBEngine'
$instancesSQL2016 = Get-SCOMClassInstance -Class $classSQL2016
$expressSQL2016   = $instancesSQL2016 | Where-Object { $_.'[Microsoft.SQLServer.2016.DBEngine].Edition'.Value -eq 'Express Edition' }
$computersSQL2016 = $expressSQL2016.Path

# Find the computer objects that host SQL 2016 and store them in an ArrayList
$targetComputers = New-Object -TypeName System.Collections.ArrayList
$computersSQL2016 | ForEach-Object {
    $tmp   = ''
    $check = $winSrvInstances.DisplayName.Contains($_)
    if ($check) {
        $number = $winSrvInstances.DisplayName.IndexOf($_)
        $tmp    = $winSrvInstances[$number]
        if ($tmp -ne '') {
            # [void] suppresses the index that ArrayList.Add() returns
            [void]$targetComputers.Add($tmp)
        }
    }
} #end $computersSQL2016 | ForEach-Object { }

# Disable the discovery rule for those computers that host SQL 2016
$discoveryRule = Get-SCOMDiscovery -DisplayName 'MSSQL 2016: Discover SQL Server 2016 Database Engines'
$targetComputers | ForEach-Object {
    Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
}

# Remove the objects from SCOM. This can take some time!
Remove-SCOMDisabledClassInstance
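
The script repeats the same host lookup for every SQL version. As a possible simplification, the lookup could live in one small function. The sketch below is not part of Ruben's script: the name Select-HostComputer and its parameters are made up for illustration, and it assumes that the Path of an Express instance equals the DisplayName of the hosting Windows computer object, as the script above does. Unlike the Contains call above, the hashtable match is case-insensitive.

```powershell
# Hypothetical helper: returns the computer objects whose DisplayName matches
# one of the given host names. A hashtable lookup replaces the repeated
# Contains/IndexOf calls; each matching computer is returned only once.
function Select-HostComputer {
    param(
        [object[]]$Computers,   # e.g. $winSrvInstances
        [string[]]$HostNames    # e.g. $expressSQL2016.Path
    )
    $byName = @{}
    foreach ($computer in $Computers) {
        $byName[[string]$computer.DisplayName] = $computer
    }
    $seen = @{}
    foreach ($name in $HostNames) {
        if ($name -and $byName.ContainsKey($name) -and -not $seen.ContainsKey($name)) {
            $seen[$name] = $true
            $byName[$name]
        }
    }
}

# Usage with the variables from the script above:
# $targets = Select-HostComputer -Computers $winSrvInstances -HostNames $expressSQL2016.Path
# $targets | ForEach-Object {
#     Disable-SCOMDiscovery -Instance $_ -Discovery $discoveryRule -ManagementPack $overrideMP
# }
```

Returning each computer only once means a server that hosts several Express instances is not overridden twice for the same discovery.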

Feedback is appreciated 😊

Warm regards

Ruben

Hybrid Monitoring Solutions during your transition to Cloud

Most enterprises now have either moved to cloud, or are moving towards it. And why not? Running your workloads on cloud services such as Azure frees you up from a lot of maintenance and administrative overheads, and you can use this time to do something better.

Here are some major benefits to moving to cloud:

  1. Fewer administrative tasks – The cloud provider is responsible for managing and upgrading its infrastructure, so the customer does not have to worry about that.
  2. Cloud is flexible – It can adapt to rapid growth or fluctuations in the business, giving you right-sized resources and helping you manage costs.
  3. Cost efficient – Since you don’t have to spend on big hardware and the maintenance that comes with it, you save that initial capital investment. Moreover, in the cloud you mostly pay only for what you use and for the time you use it, which saves a lot of cost as well.
  4. Disaster Recovery – Not every company, especially a smaller one, can invest in a Disaster Recovery strategy. On premise, it is basically like running two datacenters, and so double the cost. Moving to cloud eliminates that, since the cloud provider is responsible for resiliency on its side to keep your servers up and running even if there is a hardware failure.

These are just some of the major benefits that transitioning to cloud provides; there are many more. So if you’ve made a decision to move to the cloud, you’re looking in the right direction!

Now, on premise or in the cloud, monitoring your infrastructure is equally critical. While the cloud provider will look after the hardware components, monitoring your servers and applications is still your responsibility, and something that you need to invest time, money and effort in. There are some great tools on the market that let you monitor your infrastructure effectively, like Microsoft’s System Center Operations Manager (SCOM), and Azure Monitor, which is a monitoring solution residing in Azure. So which one should you use to monitor your infrastructure?

Since you are transitioning to cloud (say Azure), you already have an on-premise infrastructure. That most likely means you have also already made an investment in a tool like SCOM for monitoring it. So now you’re wondering, “So…does moving to Azure mean I have to decommission my SCOM now and move my monitoring to Azure Monitor?”

The good news is that you don’t have to choose between SCOM and Azure Monitor at all! They work best together in a hybrid environment and complement each other very well.

SCOM is generally considered better at monitoring on-premise workloads and has been used for that for a very long time. SCOM provides deep insights and very thorough monitoring of the workloads you care about. It is also very easy to monitor your custom applications by authoring your own management packs. In short, it gives you a more detailed look into your infrastructure and alerts you based on it.

Azure Monitor, on the other hand, is best suited to Azure resources. Since it does not require installation, it is up and running in a matter of minutes. You also don’t have to worry about maintaining, upgrading or troubleshooting it. It is highly scalable, which means you can start on-boarding your servers immediately without worrying about sizing the underlying infrastructure. However, the biggest highlight of Azure Monitor is probably its ability to query the data. Once the agent has collected the data, you can query it and get very granular. It is a very efficient way to make sure you’re only dealing with the data you want, and are only alerted on what you’re concerned with.

SCOM integrates seamlessly with Azure Monitor and can upload all the data it is collecting on premise to Azure Monitor where it can be queried. There are some great advantages of integrating SCOM with Azure Monitor, for example:

  1. You get more useful data rather than spam. Azure Monitor’s querying capability plays an important role here: you collect data from Azure resources as well as on-premise servers, and extract only the data you need for alerts that are meaningful to you.
  2. Azure Monitor provides a single pane of glass for alerts and ways to manage them across your infrastructure, so it reduces the administrative overhead considerably.
  3. With all the data going into Azure Monitor, you can switch off many SCOM workloads you no longer need, which means better performance with fewer resources used!
  4. SCOM monitors what it monitors best, the on-premise infrastructure, while Azure Monitor monitors what it monitors best, the cloud resources.
  5. You can reduce the dependency on only one monitoring solution, and run these two in parallel for resiliency.
  6. You can leverage PowerBI to visualize the data.
  7. With the release of SCOM 2019, with all its new capabilities and better visibility into cloud resources, this integration has become even better!
  8. It is much more cost-effective considering the value it returns in the long run.

Hope this helps you plan your transition to cloud while keeping all of it monitored!

(Featured image credits to Microsoft!)

Cheers

SCOM 2019 vs Azure Monitor: Which one to choose?

Having worked with both SCOM and Azure Monitor, I was recently asked to compare the two and suggest the right choice. First off, I have a disclaimer to make: Azure Monitor is great, but it cannot replace SCOM entirely, not just yet.

SCOM 2019 was recently released and it came loaded with some great new features. Read more about it here. What I especially like are its new capabilities for monitoring Azure resources. It now has more insight into the cloud than ever before. And with the ever-rising number of cloud migrations and new cloud deployments, Azure Monitor’s popularity and importance keep growing.

However, I believe these two tools have their own “personality”, if you will, and work best with each other. Here’s what I have to say about this in more detail:

Defining Your Enterprise Monitoring Strategy: Close the Gaps with SCOM 2019 and Azure Monitor

Cheers!

Export All SCOM Subscriptions into a Text File

A few days ago I needed to export all my SCOM subscriptions so that I could analyze them thoroughly. I searched online for a script or solution but didn’t find anything that matched my exact requirement. So I decided to write one myself!

I wrote a quick version of the script to get the job done, and it was indeed time well invested! The script really spells out everything, which makes it very easy to analyze the output visually and understand the relationships between the subscriptions, the channels and the subscribers. Sample output:

SCOM-Bob Cornelissen was gracious enough to publish the blog on his website as a guest post. Read it here:

Export SCOM Subscriptions using Powershell

Cheers!

Enable SCOM subscriptions with Powershell with “Fewer messages”

This is one widely known “inconvenience” of using PowerShell to manage your subscriptions in a big SCOM environment. Consider the following common scenario:

You have a SCOM environment with a fairly large number of subscriptions. You have a planned maintenance scheduled, and so you plan to disable all your subscriptions so that the support teams aren’t bothered with unnecessary alerts. There is a simple quick way of doing this:

Import-Module OperationsManager
Get-SCOMNotificationSubscription | Disable-SCOMNotificationSubscription

This will disable all your subscriptions and stop notifications from being sent. This is simple enough. The issue lies with enabling them again. The catch is that when you enable the subscriptions using PowerShell, you start getting all the notifications that were “cached” during the maintenance window, and your support teams are bombarded with tens or even hundreds of such spam notifications. This is because enabling a subscription with PowerShell uses the default option of “More messages”. Moreover, there is no apparent parameter or switch to change it to “Fewer messages”.

Get-SCOMNotificationSubscription | Enable-SCOMNotificationSubscription

If you are not familiar with these two options, here’s what they mean:

More Messages: This option means all the notifications that were cached since the subscription was disabled are forwarded to subscribers.

Fewer Messages: This option means only the notifications that are generated after the subscription was re-enabled are forwarded to the subscribers.

This discussion actually took place a few days ago on the SCOM Community Gitter Lobby (make sure to join!), and Dimitry K. suggested an excellent workaround for this issue (Kudos, Dimitry!). There’s no parameter in the cmdlet to switch options, but you can do it using a method. Here’s the code:

$sub = Get-SCOMNotificationSubscription | where {$_.displayname -like "SUBSCRIPTION_NAME"}
$sub.Enabled = $true
$sub.Update($true)

Note the last line – $sub.Update($true)

The value you pass to this method actually determines the option to choose more or fewer messages. If you choose Update($true), it is equivalent to fewer messages, and if you choose Update($false), it is equivalent to more messages in the GUI.
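
To apply this to every subscription after a maintenance window, the same property and method can be wrapped in a loop. This is a minimal sketch: the function name Enable-SubscriptionWithFewerMessages is made up for illustration, and it relies only on the Enabled property and the Update($true) call shown above.

```powershell
# Hypothetical helper: re-enables the given subscriptions with "Fewer messages".
# Subscriptions that are already enabled are left untouched.
function Enable-SubscriptionWithFewerMessages {
    param([object[]]$Subscription)
    foreach ($sub in $Subscription) {
        if (-not $sub.Enabled) {
            $sub.Enabled = $true
            $sub.Update($true)   # $true is equivalent to "Fewer messages" in the GUI
        }
    }
}

# Usage after the maintenance window:
# Import-Module OperationsManager
# Enable-SubscriptionWithFewerMessages -Subscription (Get-SCOMNotificationSubscription)
```

Note that this enables every disabled subscription it is given, so filter the list first if some subscriptions were already disabled on purpose before the maintenance.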

Hope this helps, and saves you and your support teams from a bunch of spam emails!

Cheers

Service Uptime Report in SCOM

This is a question that I often get asked by the customers I work with and, as is evident from the related questions on the forums, by a lot of others too.

One way of doing it is to author your own service monitor, but that requires a considerable amount of knowledge and experience with management packs and the underlying coding. It usually takes a lot of time as well. Not everyone has the right knowledge or the time to spend on it, so I thought I’d share a quick trick I use to measure the uptime of a service and present it to the concerned parties in the form of a report.

Many organizations use a service running on their servers as “proof” that the application was running, or as input for troubleshooting. Hence it becomes necessary to measure the uptime of the service accurately, show it to management and pass it around.

The thing is, when you’re creating a monitor to measure the availability of a service in SCOM, you usually choose a “Basic Service Monitor”. This monitor is not very “intelligent” and simply places an instance of itself on every server belonging to the class that you choose. However, you do have another option to monitor your service with, and that is the “Windows Service Template”. This type gives you many more features and finer control over your service monitoring. I wrote a blog earlier comparing these two options for service monitoring and when to use which.

SCOM basic service monitor Vs Windows Service Template

So the way the Windows Service template works is that it creates a discovery of its own and hence creates an actual class. This class can then be used to target other workflows that you may have for monitoring this class of servers. Another advantage is that you can now use this class to fetch a “Service Availability” report. For example:

Let’s say I’m monitoring the Spooler service on a bunch of servers, and I need to be able to see the uptime of this service on each of these servers. So I create a Windows Service Template monitor and call it “Test Spooler”.

Once that’s done, go to the “Discovered Inventory” tab under Monitoring. Click on “Change the target type” and you can see that “Test Spooler” is now a class available for targeting.

This means you can also point your availability report at this class and measure the uptime of the service it is monitoring.

You can also fetch a report for a group of service monitors created this way. There’s a good discussion we had a while back regarding this exact requirement:

Service Availability Report

Hope that helps!

Cheers

SCOM 2012 R2 to 1801 Side-by-Side Migration : The Powershell Way! – By Ruben Zimmermann

Ruben is back with another awesome blog post, and I have no doubt this one is going to help a lot of people!

We all know the hassles of migrating SCOM from one version to another, and it involves a lot of manual effort, especially when you choose the side-by-side upgrade path: making sure all the Management Packs are re-imported, all overrides are intact and all the permissions you’ve assigned to various users over the years are still in place, to mention just a few examples. We’ve all wished we could run some kind of script that would do it all for us. Well, this is exactly what Ruben has made available for us! Here, take a look:

Preface

Migrating from SCOM 2012 R2 to SCOM 1801 isn’t a stressful job as both environments can live in parallel.

This blog gives examples of how to migrate the configuration from A to B, leveraging PowerShell whenever possible and reasonable.

Please apply the PowerShell code only if you feel comfortable with it. If there are questions, please don’t hesitate to drop me a line.

Introduction

Although it is possible to let the agent do duplicate work, in the sense of executing the code in management packs for both environments, sending data to different management groups can cause excessive resource consumption on the monitored computer.

I suggest:

  • Migrate configuration to 1801 from 2012 R2 (blog topic)
  • Test with a very small number of machines of different types (Application Servers, Web Servers, MSSQL, Exchange, Domain Controllers)
  • Move the remaining majority to 1801
  • Change connectors to ticket systems, alert devices and other peripheral systems
  • Remove the agent connections from 2012 R2 (see below for details)
  • Monitor the new environment for a while; if all is fine, decommission 2012 R2

Requirements

  • 1801 is set up and fully functional, and the first agents have been deployed successfully.
  • Windows PowerShell 5.1 is required on the 2012 R2 and on the 1801 server
  • Service Accounts used in 2012 R2 will be re-used. If they are different in 1801, it is no big obstacle to change them back if they are overwritten by one of the import steps.
    • Usually SCOM will alert if there is a misconfiguration such as a lack of permission somewhere.

Migration

The steps below are separated into different parts. No guarantee of completeness or correctness.

Management Packs

Migrating to a new environment is always a great chance to perform cleanup and apply the lessons you’ve learned in the old one.

Review

Export all Management Packs from the current SCOM environment and review.

Get-SCOMManagementPack | Select-Object -Property Name, DisplayName, TimeCreated, LastModified, Version | Export-Csv -Path C:\temp\CurrentMps.csv -NoTypeInformation

For your convenience use Excel to open the CSV file. Import only those Management Packs which bring measurable benefit.

Overrides

Standard Approach

If you have followed best practices, you have created an Override Management Pack for each Management Pack to store your customizations. Export those and import them into your new environment.

Note: Verify that the overrides work as expected.

Green field approach

In case you have stored all the overrides in only one Management Pack, or want to be more selective, follow these steps:

  1. Create a new Override Management Pack for that specific MP and name it properly, e.g. <YourCompanyName>.Windows.Server.Overrides
  2. Follow the steps mentioned in ‘Management Pack tuning’:
     https://anaops.com/2018/06/22/guest-blog-management-pack-tuning-ruben-zimmermann/

Notification Settings (Channels, Subscribers and Subscriptions)

Quoted from: http://realscom.blogspot.de/2017/05/migrate-notifications-to-another.html

  • Export the Microsoft.SystemCenter.Notifications.Internal mp in the old and new management groups – do not modify these, they are our backup copies
  • Make a copy of both files and rename them to something meaningful like Microsoft.SystemCenter.Notifications.Internal.ManagementGroupName_backup.xml
  • Make a note of the MP version number of the Microsoft.SystemCenter.Notifications.Internal MP from the new management group. In my case it was 7.2.11719.0
  • Open up the Microsoft.SystemCenter.Notifications.Internal.xml file for the old management group and change the version number to that of the new management group MP version + 1. In my case I changed it from 7.0.9538.0 to 7.2.11719.1. This is so the MP imports properly in the new management group
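
The version change in the last step can also be made with PowerShell instead of a text editor. The following is only a sketch and not part of the quoted procedure: the function name Set-ManagementPackVersion is hypothetical, and it assumes the exported XML keeps its version in the ManagementPack/Manifest/Identity/Version element. Work on a copy, never on the backup files.

```powershell
# Hypothetical helper: returns the management pack XML with the version in
# Manifest/Identity/Version replaced. Assumes the usual unsealed MP layout.
function Set-ManagementPackVersion {
    param(
        [Parameter(Mandatory)][string]$XmlText,
        [Parameter(Mandatory)][version]$NewVersion
    )
    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true          # keep the original layout of the file
    $doc.LoadXml($XmlText)
    $node = $doc.SelectSingleNode('/ManagementPack/Manifest/Identity/Version')
    if ($null -eq $node) { throw 'Manifest/Identity/Version element not found.' }
    $node.InnerText = $NewVersion.ToString()
    $doc.OuterXml
}

# Usage with the version numbers from the quoted example:
# $file = 'C:\Temp\Microsoft.SystemCenter.Notifications.Internal.xml'
# $xml  = Get-Content -Raw -Path $file
# Set-ManagementPackVersion -XmlText $xml -NewVersion '7.2.11719.1' | Set-Content -Path $file -Encoding UTF8
```

Typing the parameter as [version] rejects malformed version strings before the file is touched.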

Exporting Configuration with PowerShell

Open a PowerShell console on the 2012 R2 Management Server and use the following cmdlets to store the information in JSON files under C:\Temp, for example C:\Temp\ResolutionStates.json.
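
Out-File does not create missing folders, so the target folders used in this section (C:\Temp and C:\Temp\SCOMExport) need to exist before the exports run. A small sketch for that, where the function name Initialize-ExportFolder is made up for illustration:

```powershell
# Hypothetical helper: makes sure an export folder exists and returns its path.
function Initialize-ExportFolder {
    param([Parameter(Mandatory)][string]$Path)
    if (-not (Test-Path -LiteralPath $Path -PathType Container)) {
        # -Force also creates missing parent folders
        New-Item -ItemType Directory -Path $Path -Force | Out-Null
    }
    $Path
}

# Usage before running the export commands:
# Initialize-ExportFolder -Path 'C:\Temp\SCOMExport'
```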

Alert Resolution State

Only user defined resolution states are exported.

Get-SCOMAlertResolutionState | Where-Object {$_.IsSystem -eq $false} | Select-Object Name, ResolutionState | ConvertTo-Json | Out-File C:\Temp\ResolutionStates.json

Alert Auto-Resolution Settings

Limited to the properties AlertAutoResolveDays and HealthyAlertAutoResolveDays

Get-SCOMAlertResolutionSetting | Select-Object AlertAutoResolveDays, HealthyAlertAutoResolveDays | ConvertTo-Json | Out-File C:\Temp\SCOMExport\AlertResolutionSetting.json

Database Grooming Settings

Exports data retention settings for information in the OperationsManager database. The DataWareHouse database is covered later.

Get-SCOMDatabaseGroomingSetting | Select-Object AlertDaysToKeep, AvailabilityHistoryDaysToKeep, EventDaysToKeep, JobStatusDaysToKeep, MaintenanceModeHistoryDaysToKeep, MonitoringJobDaysToKeep, PerformanceDataDaysToKeep, PerformanceSignatureDaysToKeep, StateChangeEventDaysToKeep | ConvertTo-Json | Out-File C:\Temp\SCOMExport\DatabaseGrooming.json

User Roles

Exporting User roles and System roles into dedicated files.

Get-SCOMUserRole | Where-Object {$_.IsSystem -eq $false} | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\UserRoles.json
Get-SCOMUserRole | Where-Object {$_.IsSystem -eq $true}  | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\SystemUserRoles.json

Run As Accounts

Exporting user defined RunAsAccounts. System defined ones are created by Management Packs.

Get-SCOMRunAsAccount | Where-Object {$_.isSystem -eq $false} | Select-Object -Property AccountType, UserName, SecureStorageId, Name, Description, SecureDataType | ConvertTo-Json | Out-File C:\Temp\SCOMExport\RunAsAccounts.json

<# The cmdlet 'Get-SCOMRunAsAccount' can't extract credential information. A free cmdlet written by Stefan Roth is required for that. Import it into your 2012 R2 environment before proceeding with the following lines.

https://www.powershellgallery.com/packages/RunAsAccount/1.0 #>

Get-SCOMRunAsAccount | Select-Object -ExpandProperty Name | ForEach-Object {
    try {
        $theName = $_
        $theTmp  = Get-RunAsCredential -Name $theName
        $thePass = ($theTmp.Password).ToString()
        # Writes the plain-text password to the output stream
        $thePass

        $secPass  = ConvertTo-SecureString $thePass -AsPlainText -Force
        $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)

        # Store the credential object and the plain-text password, one file each per account
        $theCreds | Export-Clixml -Path "C:\temp\SCOMExport\$theName.cred"
        $fileName =  "C:\temp\SCOMExport\" + $theName + ".txt"
        "$($thePass)" | Out-File -FilePath $fileName
    } catch {
        # Report accounts whose credentials could not be exported instead of failing silently
        Write-Warning "Could not export credentials for '$theName': $_"
    }
}

Importing Configuration with PowerShell

Copy the JSON files to the 1801 Management Server (C:\Temp\SCOM-Scripts) and import them using the following code:

Alert Resolution State

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\ResolutionStates.json | ConvertFrom-Json

foreach ($aState in $jsonFileContent) {
    Add-SCOMAlertResolutionState -Name $aState.Name -ResolutionStateCode $aState.ResolutionState
}


Alert Resolution Setting

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\AlertResolutionSetting.json | ConvertFrom-Json

Set-SCOMAlertResolutionSetting -AlertAutoResolveDays $jsonFileContent.AlertAutoResolveDays -HealthyAlertAutoResolveDays $jsonFileContent.HealthyAlertAutoResolveDays

Database Grooming Settings

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\DatabaseGrooming.json | ConvertFrom-Json

Set-SCOMDatabaseGroomingSetting -AlertDaysToKeep $jsonFileContent.AlertDaysToKeep -AvailabilityHistoryDaysToKeep $jsonFileContent.AvailabilityHistoryDaysToKeep -EventDaysToKeep $jsonFileContent.EventDaysToKeep

Set-SCOMDatabaseGroomingSetting -JobStatusDaysToKeep $jsonFileContent.JobStatusDaysToKeep -MaintenanceModeHistoryDaysToKeep $jsonFileContent.MaintenanceModeHistoryDaysToKeep -MonitoringJobDaysToKeep $jsonFileContent.MonitoringJobDaysToKeep

Set-SCOMDatabaseGroomingSetting -PerformanceDataDaysToKeep $jsonFileContent.PerformanceDataDaysToKeep -PerformanceSignatureDaysToKeep $jsonFileContent.PerformanceSignatureDaysToKeep -StateChangeEventDaysToKeep $jsonFileContent.StateChangeEventDaysToKeep        

User Roles
# Importing User roles and System roles from dedicated files. Review them first.

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\UserRoles.json | ConvertFrom-Json

foreach ($uRole in $jsonFileContent) {

    switch ($uRole.ProfileDisplayName) {       

        'Advanced Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -AdvancedOperator           

        }       

        'Author' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Author

        }

        'Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Operator

        }

        'Read-Only Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -ReadOnlyOperator

        }       

    }       

}




$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\SystemUserRoles.json | ConvertFrom-Json

foreach ($sRole in $jsonFileContent) {

    if ($sRole.Users) {   

        Get-SCOMUserRole | Where-Object {$_.Name -eq $sRole.Name} | Set-SCOMUserRole -User $sRole.Users

    } else {

        continue

    }    

}
Run As Accounts

# Importing Run As accounts from a dedicated file. Review it first.

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\RunAsAccounts.json | ConvertFrom-Json



foreach($rAccount in $jsonFileContent) {




    $theName  = $rAccount.Name   

    if ($theName -match 'Data Warehouse') {

        write-warning "Skipping default account $theName"

        continue

    }

    switch ($rAccount.AccountType) {

        'SCOMCommunityStringSecureData' {                       

            $credFile =  Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

            $secPass  = ConvertTo-SecureString $credFile -AsPlainText -Force

            Add-SCOMRunAsAccount -CommunityString -Name $theName -Description $rAccount.Description -String $secPass

        }

        'SCXMonitorRunAsAccount' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)               

                Add-SCOMRunAsAccount -SCXMonitoring -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo              

            } catch {           

                Write-Warning $_                            

            }           

        }

        'SCXMaintenanceRunAsAccount' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)               

                Add-SCOMRunAsAccount -SCXMaintenance -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo               

            } catch {

                Write-Warning $_                            

            }           

        }

        'SCOMBasicCredentialSecureData' {

            try{              

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)                

                Add-SCOMRunAsAccount -Basic -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds                

            } catch {

                Write-Warning $_                            

            }           

        }

        'SCOMWindowsCredentialSecureData' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theName  = $env:USERDOMAIN + '\' + $rAccount.Name

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)                

                Add-SCOMRunAsAccount -Windows -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds                

            } catch {

                Write-Warning $_                            

            }           

        }

        default {

            Write-Warning "Not covered: $($rAccount.AccountType), please add manually."

        }

    }

}
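The four credential-based cases above repeat the same three steps: read the password file, convert it to a SecureString, and build a PSCredential. A minimal sketch of a shared helper follows. New-CredentialFromFile is a hypothetical name, and the temporary demo file and password exist only to make the example self-contained.

```powershell
# Hypothetical helper: builds a PSCredential from a user name and a one-line password file.
function New-CredentialFromFile {
    param(
        [Parameter(Mandatory)] [string] $UserName,
        [Parameter(Mandatory)] [string] $PasswordFile
    )
    # Read only the first line; Get-Content returns an array for multi-line files,
    # which ConvertTo-SecureString would reject.
    $plain  = Get-Content -Path $PasswordFile -TotalCount 1
    $secure = ConvertTo-SecureString $plain -AsPlainText -Force
    New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $UserName, $secure
}

# Demo with a throwaway file (illustrative values only).
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) 'demo-account.txt'
'P@ssw0rd-demo' | Out-File -FilePath $demoFile
$cred = New-CredentialFromFile -UserName 'demo-account' -PasswordFile $demoFile
$demoPassword = $cred.GetNetworkCredential().Password
Remove-Item -Path $demoFile
$cred.UserName
```

Inside the switch, each case would then shrink to one call to the helper followed by the matching Add-SCOMRunAsAccount call.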
Run As Profiles

Run As Profiles usually come with Management Packs and are therefore not handled here.

Note:

Run As Profiles are the binding component between a Run As Account and the object (e.g. a Computer / Health State) it is used for. Please check the configuration in the 2012 R2 environment to set up the mapping and distribution manually.

Data Warehouse Retention Settings

Data Warehouse retention settings are configured with a command-line tool named dwdatarp.exe. It was released in 2008 and has worked ever since.

Download link can be found here: https://blogs.technet.microsoft.com/momteam/2008/05/13/data-warehouse-data-retention-policy-dwdatarp-exe/

Checking the current settings

After downloading, copy the unpacked executable to your SCOM database server (e.g. C:\Temp).

Open an elevated command prompt and use the following call to export the current configuration:

dwdatarp.exe -s OldSCOMDBSrvName -d OperationsManagerDW > c:\dwoutput.txt

Setting new values

The following values are just a suggestion to reduce the amount of required space.

dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Alert data set" -a "Raw data" -m 180
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Raw data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Configuration dataset" -a "Raw data" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Event data set" -a "Raw data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Daily aggregations" -m 90
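If the same retention values need to be applied to more than one environment, the command lines can be generated from a small table instead of being edited by hand. This is a minimal sketch: the values are copied from the commands above, and the variable names are illustrative. It only builds the strings and does not run dwdatarp.exe.

```powershell
# Retention targets: data set, aggregation type, days to keep (copied from the list above).
$retention = @(
    @{ DataSet = 'Alert data set';             Aggregation = 'Raw data';            Days = 180 }
    @{ DataSet = 'Client Monitoring data set'; Aggregation = 'Raw data';            Days = 30  }
    @{ DataSet = 'Client Monitoring data set'; Aggregation = 'Daily aggregations';  Days = 90  }
    @{ DataSet = 'Configuration dataset';      Aggregation = 'Raw data';            Days = 90  }
    @{ DataSet = 'Event data set';             Aggregation = 'Raw data';            Days = 30  }
    @{ DataSet = 'Performance data set';       Aggregation = 'Raw data';            Days = 15  }
    @{ DataSet = 'Performance data set';       Aggregation = 'Hourly aggregations'; Days = 30  }
    @{ DataSet = 'Performance data set';       Aggregation = 'Daily aggregations';  Days = 90  }
    @{ DataSet = 'State data set';             Aggregation = 'Raw data';            Days = 15  }
    @{ DataSet = 'State data set';             Aggregation = 'Hourly aggregations'; Days = 30  }
    @{ DataSet = 'State data set';             Aggregation = 'Daily aggregations';  Days = 90  }
)

$server   = 'newscomdbsrv'          # target SQL server, as in the commands above
$database = 'OperationsManagerDW'

# Build one command line per table row.
$commands = foreach ($r in $retention) {
    'dwdatarp.exe -s {0} -d {1} -ds "{2}" -a "{3}" -m {4}' -f `
        $server, $database, $r.DataSet, $r.Aggregation, $r.Days
}

$commands
```

The generated lines can be reviewed and then pasted into an elevated command prompt on the database server.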

Agent Migration – Remove the older Management Group

After the computer appears as a fully managed computer in the SCOM console, remove the older management group.

Jimmy Harper has authored a Management Pack that helps to clean up the agent migration after the new agent has been deployed. Please read through:

https://blogs.technet.microsoft.com/jimmyharper/2016/08/13/scom-task-to-add-or-remove-management-groups-on-agents/

This is great! Thanks a lot, Ruben, for taking the time and effort to put this all together!

Get to know more about Ruben here!

Cheers!

Linux/UNIX agent failover and Resource Pools – Debunking the Myth! – Part 2

We discussed the common misconceptions regarding failover for Windows agents in SCOM in the previous post (part 1). In this part, as promised, Stoyan Chalakov discusses failover for Linux machines monitored by SCOM. Here goes:

Linux/UNIX agent failover and Resource Pools – debunking the myth – Part 2

While answering questions on the Microsoft Technet Forums, I noticed that there are lots of questions on topics that keep coming up or that are often not well understood. So I had a discussion with Sam (Sameer Mhaisekar) and we decided that it would be very beneficial to write short blog posts on those topics.

One such topic is agent failover in SCOM and the difference between a Windows and a Linux/UNIX agent. In order to understand how the Linux/UNIX agent failover works, one needs to first become acquainted with the concept of Resource Pools in Operations Manager.

Sam already wrote the first part on this subject, where he explains the basics of Resource Pools and also gives an example of how the failover of the Windows agent works in SCOM. He also referenced 3 very important articles about Resource Pools in SCOM, which you need to read before getting to Linux/UNIX agent failover.

Before jumping to failover, it is worth mentioning some important facts about the architecture of the Linux/UNIX agent.

The Linux/UNIX agent is very different from the Windows one and hasn’t changed since Operations Manager 2012. One of the most important functional differences compared to a Windows agent is the absence of a Health Service implementation. So, all the monitored data is passed to the Health Service on a management server, where all the management pack workflows run. This makes it a passive agent, which is queried (using the WSMan protocol, port 1270) for availability and performance data by the management servers in the Resource Pool.

The first important thing that needs mentioning is that you cannot discover and monitor UNIX/Linux systems without configuring a Resource Pool first. When you start the SCOM discovery wizard you will notice that you cannot continue unless you have selected a Resource Pool from the drop-down menu.

Here are a few important considerations regarding the Resource Pool that will manage the UNIX/Linux agents:

  • It is recommended and, in my opinion, very important to dedicate the management servers in the cross platform Resource Pool only to UNIX/Linux monitoring. The reason for this is the capacity limits that Operations Manager has when it comes to monitoring UNIX/Linux and Windows, which need to be calculated very accurately. I will try to explain this in detail. A dedicated management server can handle up to 3000 Windows agents, but only 1000 Linux or UNIX computers. We already revealed the reason for that: cross platform workflows run on the management server, and this costs performance.

So, if you also have Windows agents reporting to the dedicated management server, capacity and scalability calculations cannot be made precisely and the performance of the management server can be jeopardized.

This fully applies, and is a must, for larger organizations where there are many Linux or UNIX computers (hundreds of systems) and their number is growing. In smaller monitored environments, where you have a small (tens of systems) and (almost) static number of cross platform agents, dedicating management servers only to those systems can very often be overkill. So, in such cases, I often use management servers that are not dedicated and are members of the Default Resource Pools or are managing Windows agents. This, of course, is only possible if the number of Windows agents is way below the capacity limits of the management group, which leaves enough system resources on the management server for running the Linux/UNIX workflows.

  • Dedicating the management server in the cross platform Resource Pool only to UNIX/Linux monitoring means not only that you should not assign Windows agents to report to it, but also that the server should be excluded from the other Resource Pools in the management group (SCOM Default Resource Pools, network monitoring Resource Pools, etc.). The reason is the same: performance. If the management server participates in other Resource Pools, it will also execute other types of workflows.

To exclude the management server from the Default Resource Pools, you will need to change their membership from automatic to manual. By default, each management server added to the management group is automatically added to the Resource Pools that have an automatic membership type. For some of the Resource Pools this can be done in the console; for others, like the “All Management Servers Resource Pool”, it can be done only with PowerShell:

Get-SCOMResourcePool -DisplayName "All Management Servers Resource Pool" | Set-SCOMResourcePool -EnableAutomaticMembership $false

  • When you do the capacity planning for your management group, make sure you don’t forget to calculate the number of UNIX or Linux computers a management server can handle in the case another member of the Resource Pool fails. Let me explain this with an example:

According to the official documentation (see the link above), a dedicated management server can handle up to 1000 Linux or UNIX computers. But if you have two dedicated management servers in your cross platform Resource Pool and you aim for high availability, you cannot assign 2000 (2 x 1000) agents to the Pool. Why? Just imagine what will happen to a management server if its “buddy” from the same Resource Pool fails and all its agents get reassigned to the one that is still operational. You guessed right: it will quickly be overwhelmed by all the agents and become non-operational. So, the right thing to do if you would like your cross platform monitoring to be highly available is to have 2 management servers for not more than 1000 agents, so that in case of failure the remaining server can still handle the performance load.

  • Last, but not least, make sure the management servers, which will be used for Linux/UNIX monitoring are sized (RAM, CPUs, Disk space) according to the Microsoft recommendations.
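The capacity rule from the example above can be written down as simple arithmetic. This is a minimal sketch that assumes the limit of 1000 UNIX/Linux computers per dedicated management server quoted above. Get-RequiredPoolSize and Get-PoolAgentCapacity are hypothetical helper names, and the calculation looks only at agent load, not at pool quorum.

```powershell
# Hypothetical helper: number of pool members needed so that the pool still
# carries all agents after the given number of members has failed.
function Get-RequiredPoolSize {
    param(
        [Parameter(Mandatory)] [int] $AgentCount,
        [int] $PerServerLimit = 1000,    # UNIX/Linux computers per dedicated management server
        [int] $ToleratedFailures = 1
    )
    [int][math]::Ceiling($AgentCount / $PerServerLimit) + $ToleratedFailures
}

# Hypothetical helper: number of agents a pool can carry while still
# surviving the given number of member failures.
function Get-PoolAgentCapacity {
    param(
        [Parameter(Mandatory)] [int] $MemberCount,
        [int] $PerServerLimit = 1000,
        [int] $ToleratedFailures = 1
    )
    ($MemberCount - $ToleratedFailures) * $PerServerLimit
}

Get-RequiredPoolSize  -AgentCount 1000   # servers needed for 1000 agents
Get-PoolAgentCapacity -MemberCount 2     # agents a two-server pool can carry
```

With the defaults, 1000 agents call for 2 management servers, and a pool of 2 servers should carry no more than 1000 agents, which matches the example above.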

Now back to the agent failover topic… I think it is already pretty clear how the Linux/UNIX agent failover happens behind the scenes, but a short summary won’t do any harm:

  • After the discovery wizard is started, a Resource Pool must be selected for managing the systems.
  • When the Resource Pool is selected, it assigns one of the participating management servers to complete the actual discovery of the systems and take over the monitoring.
  • When the management server fails, the Resource Pool selects one of its other members to take over the monitoring.

Here is a reference to what we said in the beginning: the UNIX/Linux agent is passive and is queried by the management server. Because of this, it is not actually aware of what happens in the background and simply continues to communicate with the server that has now been assigned to it.

Now is also the right time to make a couple of very important notes:

  • XPlat (cross platform) certificates

Part of preparing the environment for monitoring cross platform systems is the creation of a self-signed certificate and its deployment to every management server that is a member of the Resource Pool. This ensures that in case of failover each management server will be able to communicate with the agent using the same certificate.

  • High availability with Operations Manager Gateways as members of the Resource Pool (thanks to Graham Davies for the reminder)

What I forgot to mention in the first version of this post, but which is of high importance for maintaining high availability of your Gateway Resource Pools (Resource Pools consisting of Operations Manager Gateway servers), is the fact that two Gateways are not sufficient for achieving high availability. Why? You will find the answer in the article Kevin Holman wrote about Resource Pools in SCOM and how exactly they provide high availability. This is also the same article Sam posted in the first part, and it is a must-read if you have to plan for and manage Resource Pools and cross platform agent failover in Operations Manager:

Understanding SCOM Resource Pools

https://kevinholman.com/2016/11/21/understanding-scom-resource-pools/
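As I understand the linked article, a Resource Pool stays operational only while more than half of its voting members are available. A pool of management servers gets the operational database as an additional default observer vote, which a pool of Gateways does not have. A minimal sketch of that arithmetic under those assumptions follows; Test-PoolQuorum is a hypothetical helper name, not a SCOM cmdlet.

```powershell
# Hypothetical helper: returns $true while more than half of the votes are still available.
function Test-PoolQuorum {
    param(
        [Parameter(Mandatory)] [int] $MemberCount,
        [Parameter(Mandatory)] [int] $FailedMembers,
        [switch] $DefaultObserver      # adds one vote (assumed: management server pools only)
    )
    $votes     = $MemberCount + [int]$DefaultObserver.IsPresent
    $available = $votes - $FailedMembers
    ($available * 2) -gt $votes
}

$twoGateways   = Test-PoolQuorum -MemberCount 2 -FailedMembers 1
$threeGateways = Test-PoolQuorum -MemberCount 3 -FailedMembers 1
$twoMgmtServer = Test-PoolQuorum -MemberCount 2 -FailedMembers 1 -DefaultObserver

"Two Gateways, one down:           $twoGateways"
"Three Gateways, one down:         $threeGateways"
"Two management servers, one down: $twoMgmtServer"
```

Under these assumptions, a pool of two Gateways loses quorum as soon as one member fails, while three Gateways, or two management servers plus the observer, keep it.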

Conclusion

Understanding UNIX/Linux agent high availability and failover is not a hard thing to do. Still, in order to properly plan for Operations Manager cross platform monitoring, there are some additional things like sizing and scalability that need to be considered.

Awesome, thanks a lot Stoyan! It was a very informative and interesting read, as usual! 🙂

You can get in touch with Stoyan on LinkedIn or Twitter, or visit his Technet profile.

Cheers!