Service Uptime Report in SCOM

This is a question I often get from the customers I work with and, judging by the related questions on the forums, from a lot of other people as well.

One way of doing it is to author your own service monitor, but that takes considerable knowledge of and experience with management packs and the underlying code, and it usually takes a lot of time as well. Not everyone has that knowledge or the time to spend on it, so I thought I’d share a quick trick I use to measure the uptime of a service and present it to the concerned parties in the form of a report.

It often happens that a service runs on your servers and the organization uses its uptime as “proof” that the application was running, or as input for troubleshooting. It therefore becomes necessary to measure the uptime of the service accurately and to be able to show it to management or pass it around.

The thing is, when you create a monitor to measure the availability of a service in SCOM, you usually choose a “Basic Service Monitor”. This monitor is not very “intelligent”: it simply places an instance of itself on every server belonging to the class you choose. However, you have another option for monitoring your service, the “Windows Service Template”. It gives you many more features and finer control over your service monitoring. I wrote a blog post earlier comparing these two options and when to use which:

SCOM basic service monitor Vs Windows Service Template

The Windows Service template creates a discovery of its own and hence an actual class. This class can then be used as the target for other workflows that you may need for this class of servers. Another advantage is that you can use this class to fetch a “Service Availability” report. For example:

Let’s say I’m monitoring the Spooler service on a bunch of servers and I need to see the uptime of this service on each of them. So I create a Windows Service Template monitor and call it “Test Spooler”.

Once that’s done, go to the “Discovered Inventory” view under Monitoring. Click “Change the target type” and you can see that “Test Spooler” is now a class available for targeting.

This means you can also target an availability report at this class and measure the uptime of the service it is monitoring.

You can also fetch a report for a group of service monitors created this way. We had a good discussion a while back about this exact requirement:

Service Availability Report

Hope that helps!

Cheers

SCOM 2012 R2 to 1801 Side-by-Side Migration : The Powershell Way! – By Ruben Zimmermann

Ruben is back with another awesome blog post, and I have no doubt this one is going to help a lot of people!

We all know the hassle of migrating SCOM from one version to another and the manual effort it involves, especially when you choose the side-by-side upgrade path: making sure all the Management Packs are re-imported, all overrides are intact, and all the permissions you’ve assigned to various users over the years are still in place, just to mention a few examples. We’ve all wished we could run some kind of script that does it all for us. Well, this is exactly what Ruben has made available! Here, take a look:

Preface

Migrating from SCOM 2012 R2 to SCOM 1801 isn’t a stressful job as both environments can live in parallel.

This blog gives examples of how to migrate the configuration from A to B, leveraging PowerShell whenever possible and reasonable.

Please apply the PowerShell code only if you feel comfortable with it. If there are questions, please don’t hesitate to drop me a line.

Introduction

Although it is possible to let an agent report to both management groups, it then executes the management pack code twice and sends data to two management groups, which can cause excessive resource consumption on the monitored computer.

I suggest:

  • Migrate the configuration from 2012 R2 to 1801 (the topic of this blog)
  • Test with a very small number of machines of different types (Application Servers, Web Servers, MSSQL, Exchange, Domain Controllers)
  • Move the remaining majority to 1801
  • Change connectors to ticket systems, alert devices and other peripheral systems
  • Remove the agent connections from 2012 R2 (see below for details)
  • Monitor the new environment for a while; if all is fine, decommission 2012 R2

Requirements

  • 1801 is set up and fully functional, and the first agents have been deployed successfully.
  • Windows PowerShell 5.1 is required on the 2012 R2 and on the 1801 server.
  • Service accounts used in 2012 R2 will be re-used. If they are different in 1801, it is no big obstacle to change them back should one of the import steps overwrite them.
    • Usually SCOM will alert if there is a misconfiguration such as a lack of permission somewhere.

Migration

Below are the steps, separated into their different parts. No guarantee of completeness or correctness.

Management Packs

Migrating to a new environment is always a great chance to perform cleanup and apply the lessons you’ve learned in the old one.

Review

Export all Management Packs from the current SCOM environment and review.

Get-SCOMManagementPack | Select-Object -Property Name, DisplayName, TimeCreated, LastModified, Version | Export-Csv -Path C:\temp\CurrentMps.csv -NoTypeInformation

For your convenience use Excel to open the CSV file. Import only those Management Packs which bring measurable benefit.

Overrides

Standard Approach

If you have followed best practices, you have created an Override Management Pack for each Management Pack to store your customizations. Export those and import them into your new environment.

Note: Verify that the overrides work as expected.
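
The export and import of the override packs can be scripted as well. The following is a minimal sketch, not a tested procedure: it assumes your override packs are unsealed and carry "Override" in their name, and both folder paths are placeholders that must already exist. The Management Packs that the overrides reference have to be present in the new environment before the import will succeed.

```powershell
# Sketch only: assumes the OperationsManager module is loaded in both consoles.
# The '*Override*' name filter and both folder paths are assumptions; adjust them.

# On the 2012 R2 Management Server: export the unsealed override packs as XML.
Get-SCOMManagementPack |
    Where-Object { -not $_.Sealed -and $_.Name -like '*Override*' } |
    Export-SCOMManagementPack -Path 'C:\Temp\SCOMExport\Overrides'

# On the 1801 Management Server: import the copied files one by one.
Get-ChildItem -Path 'C:\Temp\SCOM-Scripts\Overrides' -Filter '*.xml' |
    ForEach-Object { Import-SCOMManagementPack -Fullname $_.FullName }
```

Importing file by file makes it easier to see which pack fails if a referenced Management Pack is missing.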

Green field approach

In case you have stored all overrides in only one Management Pack, or want to be more selective, follow these steps:

  1. Create a new Override Management Pack for that specific MP and name it properly, e.g. <YourCompanyName>.Windows.Server.Overrides
  2. Follow the steps mentioned in ‘Management Pack tuning’:
     https://anaops.com/2018/06/22/guest-blog-management-pack-tuning-ruben-zimmermann/

Notification Settings (Channels, Subscribers and Subscriptions)

Quoted from: http://realscom.blogspot.de/2017/05/migrate-notifications-to-another.html

  • Export the Microsoft.SystemCenter.Notifications.Internal mp in the old and new management groups – do not modify these, they are our backup copies
  • Make a copy of both files and rename them to something meaningful like Microsoft.SystemCenter.Notifications.Internal.ManagementGroupName_backup.xml
  • Make a note of the MP version number of the Microsoft.SystemCenter.Notifications.Internal MP from the new management group. In my case it was 7.2.11719.0
  • Open up the Microsoft.SystemCenter.Notifications.Internal.xml file for the old management group and change the version number to that of the new management group MP version + 1. In my case I changed it from 7.0.9538.0 to 7.2.11719.1. This is so the MP imports properly in the new management group
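
The "new version + 1" rule from the last step can be worked out with the .NET version type instead of by hand. This is only an illustration: Get-NextMpVersion is a hypothetical helper name, not a SCOM cmdlet, and the version number is the one quoted above.

```powershell
# Hypothetical helper (not part of SCOM): raise the revision of a version string by one.
function Get-NextMpVersion {
    param([Parameter(Mandatory)][string]$TargetVersion)

    $v = [version]$TargetVersion
    # Major.Minor.Build stay as in the new management group; only Revision goes up.
    ([version]::new($v.Major, $v.Minor, $v.Build, $v.Revision + 1)).ToString()
}

# Version of the MP in the new management group, as quoted in the text above.
Get-NextMpVersion -TargetVersion '7.2.11719.0'   # 7.2.11719.1
```

The resulting value is what goes into the version number of the old management group's XML file before it is imported.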

Exporting Configuration with PowerShell

Open a PowerShell console on the 2012 R2 Management Server and use the following cmdlets to store the information in JSON files under C:\Temp. Create the C:\Temp\SCOMExport folder first, as Out-File does not create missing folders.

Alert Resolution State

Only user defined resolution states are exported.

Get-SCOMAlertResolutionState | Where-Object {$_.IsSystem -eq $false} | Select-Object Name, ResolutionState | ConvertTo-Json | Out-File C:\Temp\ResolutionStates.json

Alert Auto-Resolution Settings

Limited to the properties AlertAutoResolveDays and HealthyAlertAutoResolveDays

Get-SCOMAlertResolutionSetting | Select-Object AlertAutoResolveDays, HealthyAlertAutoResolveDays | ConvertTo-Json | Out-File C:\Temp\SCOMExport\AlertResolutionSetting.json

Database Grooming Settings

Exports the data retention settings for the OperationsManager database. The Data Warehouse database is covered later.

Get-SCOMDatabaseGroomingSetting | Select-Object AlertDaysToKeep, AvailabilityHistoryDaysToKeep, EventDaysToKeep, JobStatusDaysToKeep, MaintenanceModeHistoryDaysToKeep, MonitoringJobDaysToKeep, PerformanceDataDaysToKeep, PerformanceSignatureDaysToKeep, StateChangeEventDaysToKeep | ConvertTo-Json | Out-File C:\Temp\SCOMExport\DatabaseGrooming.json

User Roles

Exporting User roles and System roles into dedicated files.

Get-SCOMUserRole | Where-Object {$_.IsSystem -eq $false } | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\UserRoles.json
Get-SCOMUserRole | Where-Object {$_.IsSystem -eq $true }  | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\SystemUserRoles.json

Run As Accounts

Exporting user defined RunAsAccounts. System defined ones are created by Management Packs.

Get-SCOMRunAsAccount | Where-Object {$_.isSystem -eq $false} | Select-Object -Property AccountType, UserName, SecureStorageId, Name, Description, SecureDataType | ConvertTo-Json | Out-File C:\Temp\SCOMExport\RunAsAccounts.json
<# The cmdlet 'Get-SCOMRunAsAccount' can't extract credential information. A free module written by Stefan Roth is required for that. Import it into your 2012 R2 environment before proceeding with the following lines.

https://www.powershellgallery.com/packages/RunAsAccount/1.0 #>
Get-SCOMRunAsAccount | Select-Object -ExpandProperty Name | ForEach-Object {
    try {
        $theName  = $_
        $theTmp   = Get-RunAsCredential -Name $theName
        $thePass  = ($theTmp.Password).ToString()

        $secPass  = ConvertTo-SecureString $thePass -AsPlainText -Force
        $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)

        $theCreds | Export-Clixml -Path "C:\temp\SCOMExport\$theName.cred"

        # The .txt file holds the password in clear text; the import script reads it later.
        # Protect the folder and delete the files once the migration is done.
        $fileName = "C:\temp\SCOMExport\" + $theName + ".txt"
        "$($thePass)" | Out-File -FilePath $fileName
    } catch {
        # Skip accounts whose credentials cannot be read, but report them.
        Write-Warning "Could not export credentials for '$theName': $_"
    }
}

Importing Configuration with PowerShell

Copy the JSON files to the 1801 Management Server (C:\Temp\SCOM-Scripts) and the exported credential .txt files to C:\Temp\SCOM-Scripts\creds, then import using the following code:

Alert Resolution State
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\ResolutionStates.json | ConvertFrom-Json

foreach ($aState in $jsonFileContent) {
    Add-SCOMAlertResolutionState -Name $aState.Name -ResolutionStateCode $aState.ResolutionState
}
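
If the import is run more than once, or a code is already taken in the new management group, it may help to skip states that already exist. The sketch below shows only the filtering step, with sample data standing in for the JSON file and for the codes on the target. Select-MissingResolutionState is a hypothetical helper, not a SCOM cmdlet.

```powershell
# Hypothetical helper: keep only states whose code is not yet present on the target.
function Select-MissingResolutionState {
    param(
        [object[]]$Exported,     # objects with Name and ResolutionState, as in the JSON export
        [int[]]$ExistingCodes    # codes already defined in the 1801 management group
    )
    $Exported | Where-Object { $ExistingCodes -notcontains $_.ResolutionState }
}

# Sample data; in a real run $exported comes from ResolutionStates.json and the
# existing codes from (Get-SCOMAlertResolutionState).ResolutionState on the 1801 server.
$exported = '[{"Name":"Assigned","ResolutionState":10},{"Name":"Waiting","ResolutionState":20}]' | ConvertFrom-Json
$missing  = Select-MissingResolutionState -Exported $exported -ExistingCodes 0, 10, 255
$missing.Name   # Waiting
```

Only the objects returned by the helper would then be passed to Add-SCOMAlertResolutionState.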


Alert Resolution Setting
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\AlertResolutionSetting.json | ConvertFrom-Json

Set-SCOMAlertResolutionSetting -AlertAutoResolveDays $jsonFileContent.AlertAutoResolveDays -HealthyAlertAutoResolveDays $jsonFileContent.HealthyAlertAutoResolveDays

Database Grooming Settings
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\DatabaseGrooming.json | ConvertFrom-Json

Set-SCOMDatabaseGroomingSetting -AlertDaysToKeep $jsonFileContent.AlertDaysToKeep -AvailabilityHistoryDaysToKeep $jsonFileContent.AvailabilityHistoryDaysToKeep -EventDaysToKeep $jsonFileContent.EventDaysToKeep

Set-SCOMDatabaseGroomingSetting -JobStatusDaysToKeep $jsonFileContent.JobStatusDaysToKeep -MaintenanceModeHistoryDaysToKeep $jsonFileContent.MaintenanceModeHistoryDaysToKeep -MonitoringJobDaysToKeep $jsonFileContent.MonitoringJobDaysToKeep

Set-SCOMDatabaseGroomingSetting -PerformanceDataDaysToKeep $jsonFileContent.PerformanceDataDaysToKeep -PerformanceSignatureDaysToKeep $jsonFileContent.PerformanceSignatureDaysToKeep -StateChangeEventDaysToKeep $jsonFileContent.StateChangeEventDaysToKeep        

User Roles
# Importing User roles and System roles from dedicated files. Review them first.
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\UserRoles.json | ConvertFrom-Json

foreach ($uRole in $jsonFileContent) {
    switch ($uRole.ProfileDisplayName) {
        'Advanced Operator' {
            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -AdvancedOperator
        }
        'Author' {
            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Author
        }
        'Operator' {
            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Operator
        }
        'Read-Only Operator' {
            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -ReadOnlyOperator
        }
    }
}

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\SystemUserRoles.json | ConvertFrom-Json

foreach ($sRole in $jsonFileContent) {
    # Only system roles that actually have users assigned need updating.
    if ($sRole.Users) {
        Get-SCOMUserRole | Where-Object {$_.Name -eq $sRole.Name} | Set-SCOMUserRole -User $sRole.Users
    }
}

Run As Accounts
# Importing Run As Accounts from the dedicated file. Review it first.
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\RunAsAccounts.json | ConvertFrom-Json

foreach($rAccount in $jsonFileContent) {
    $theName  = $rAccount.Name

    if ($theName -match 'Data Warehouse') {
        Write-Warning "Skipping default account $theName"
        continue
    }

    switch ($rAccount.AccountType) {
        'SCOMCommunityStringSecureData' {
            $credFile =  Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"
            $secPass  = ConvertTo-SecureString $credFile -AsPlainText -Force
            Add-SCOMRunAsAccount -CommunityString -Name $theName -Description $rAccount.Description -String $secPass
        }
        'SCXMonitorRunAsAccount' {
            try {
                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"
                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force
                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)
                Add-SCOMRunAsAccount -SCXMonitoring -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo
            } catch {
                Write-Warning $_
            }
        }
        'SCXMaintenanceRunAsAccount' {
            try {
                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"
                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force
                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)
                Add-SCOMRunAsAccount -SCXMaintenance -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo
            } catch {
                Write-Warning $_
            }
        }
        'SCOMBasicCredentialSecureData' {
            try {
                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"
                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force
                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)
                Add-SCOMRunAsAccount -Basic -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds
            } catch {
                Write-Warning $_
            }
        }
        'SCOMWindowsCredentialSecureData' {
            try {
                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"
                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force
                # Windows accounts are created with the current domain as prefix.
                $theName  = $env:USERDOMAIN + '\' + $rAccount.Name
                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)
                Add-SCOMRunAsAccount -Windows -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds
            } catch {
                Write-Warning $_
            }
        }
        default {
            Write-Warning "Not covered: $($rAccount.AccountType), please add manually."
        }
    }
}

Run As Profiles

Run As Profiles usually come with Management Packs and are therefore not handled here.

Note:

Run As Profiles are the binding component between a Run As Account and the object (e.g. a Computer / Health State) it is used for. Please check the configuration in the 2012 R2 environment and set up the mapping and distribution manually.

Datawarehouse Retention Settings

Data Warehouse retention settings are configured with a command-line tool named dwdatarp.exe. It was released in 2008 and has worked ever since.

Download link can be found here: https://blogs.technet.microsoft.com/momteam/2008/05/13/data-warehouse-data-retention-policy-dwdatarp-exe/

Checking the current settings

After downloading, copy the unpacked executable to your SCOM database server (e.g. C:\Temp).

Open an elevated command prompt and use the following call to export the current configuration:

dwdatarp.exe -s OldSCOMDBSrvName -d OperationsManagerDW > c:\dwoutput.txt

Setting new values

The following values are just a suggestion to reduce the amount of required space.

dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Alert data set" -a "Raw data" -m 180
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Raw data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Configuration dataset" -a "Raw data" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Event data set" -a "Raw Data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Daily aggregations" -m 90

Agent Migration – Remove the older Management Group

After the computer appears as a fully managed computer in the new SCOM console, remove the older Management Group from the agent.

Jimmy Harper has authored a Management Pack which helps to clean up after the agent migration once the new agent has been deployed. Please read through:

https://blogs.technet.microsoft.com/jimmyharper/2016/08/13/scom-task-to-add-or-remove-management-groups-on-agents/

This is great! Thanks a lot, Ruben, for taking the time and effort to put this all together!

Get to know more about Ruben here!

Cheers!

Linux/UNIX agent failover and Resource Pools – Debunking the Myth! – Part 2

We discussed the common misconceptions about failover for Windows agents in SCOM in the previous post (part 1). In this part, as promised, Stoyan Chalakov discusses failover for Linux machines monitored by SCOM. Here goes:

While answering questions on the Microsoft TechNet Forums, I noticed that there are lots of questions on topics which keep repeating or which are often not that well understood. So I had a discussion with Sam (Sameer Mhaisekar) and we decided that it would be very beneficial to write short blog posts on those topics.

One such topic is agent failover in SCOM and the difference between a Windows and a Linux/UNIX agent. In order to understand how the Linux/UNIX agent failover works, one first needs to become acquainted with the concept of Resource Pools in Operations Manager.

Sam already wrote the first part on this subject, where he explains the basics of Resource Pools and also gives an example of how the failover of the Windows agent works in SCOM. He also referenced 3 very important articles about Resource Pools in SCOM, which you need to read before getting to Linux/UNIX agent failover.

Before jumping to failover, it is important to mention some facts about the architecture of the Linux/UNIX agent.

The Linux/UNIX agent is very different from the Windows one and hasn’t changed since Operations Manager 2012. One of the most important functional differences compared to a Windows agent is the absence of a Health Service implementation. So, all the monitoring data is passed to the Health Service on a management server, where all the management pack workflows are run. This makes it a passive agent, which is queried (using the WSMan protocol, port 1270) for availability and performance data by the management servers in the Resource Pool.

The first important thing that needs mentioning is that you cannot discover and monitor UNIX/Linux systems without configuring a Resource Pool first. When you start the SCOM discovery wizard you will notice that you cannot continue unless you have selected a Resource Pool from the drop-down menu.

Here are a few important considerations regarding the Resource Pool that will manage the UNIX/Linux agents:

  • It is recommended and, in my opinion, very important to dedicate the management servers in the cross platform Resource Pool only to UNIX/Linux monitoring. The reason for this is the capacity limits which Operations Manager has when it comes to monitoring UNIX/Linux and Windows, and which need to be calculated very accurately. I will try to explain this in detail. A dedicated management server can handle up to 3000 Windows agents, but only 1000 Linux or UNIX computers. We already revealed the reason for that: cross platform workflows are run on the management server, and this costs performance.

So, if you also have Windows agents reporting to the dedicated management server, capacity and scalability calculations cannot be made precisely and the performance of the management server can be jeopardized.

This fully applies, and is a must, for larger organizations where there are many Linux or UNIX computers (hundreds of systems) and their number grows. In smaller environments, where you have a small (tens of systems) and (almost) static number of cross platform agents, dedicating management servers only to those systems can be overkill. So, in such cases, I often use management servers which are not dedicated and are members of the Default Resource Pools or are managing Windows agents. This, of course, is only possible if the number of Windows agents is way below the capacity limits of the management group, which leaves enough system resources on the management server for running the Linux/UNIX workflows.

  • Dedicating the management server in the cross platform Resource Pool only to UNIX/Linux monitoring means not only that you should not assign Windows agents to report to it, but also that the server should be excluded from the other Resource Pools in the management group (SCOM Default Resource Pools, network monitoring Resource Pools, etc.). The reason is the same: performance. If the management server participates in other Resource Pools, it will also execute other types of workflows.

To exclude the management server from the Default Resource Pools, you will need to change their membership from automatic to manual. By default, each management server added to the management group is automatically added to the Resource Pools that have an automatic membership type. For some of the Resource Pools this can be changed in the console; for others, like the “All Management Servers Resource Pool”, it can be done only with PowerShell:

Get-SCOMResourcePool -DisplayName "All Management Servers Resource Pool" | Set-SCOMResourcePool -EnableAutomaticMembership $false

  • When you do the capacity planning for your management group, make sure you don’t forget to calculate the number of UNIX or Linux computers a management server can handle in the case another member of the Resource Pool fails. Let me explain this with an example:

According to the official documentation (see the link above), a dedicated management server can handle up to 1000 Linux or UNIX computers. But if you have two dedicated management servers in your cross platform Resource Pool and you aim for high availability, you cannot assign 2000 (2 x 1000) agents to the Pool. Why? Just imagine what will happen to a management server if its “buddy” from the same Resource Pool fails and all its agents get reassigned to the one which is still operational. You guessed right: it will quickly be overwhelmed by all the agents and become non-operational. So, the right thing to do if you would like your cross platform monitoring to be highly available is to have 2 management servers for not more than 1000 agents, so that in case of failure the remaining server can still handle the load.

  • Last, but not least, make sure the management servers, which will be used for Linux/UNIX monitoring are sized (RAM, CPUs, Disk space) according to the Microsoft recommendations.
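
The failover sizing rule from the list above boils down to simple arithmetic: plan only for the servers that remain after a failure. The function below is a minimal sketch of that calculation. Get-XPlatPoolCapacity is a hypothetical helper, the default of 1000 agents per server is the limit quoted above, and the model deliberately ignores everything else that influences sizing (hardware, workflow load, pool quorum).

```powershell
# Hypothetical helper: agents a cross platform pool can carry while tolerating failed members.
function Get-XPlatPoolCapacity {
    param(
        [Parameter(Mandatory)][int]$ManagementServers,
        [int]$AgentsPerServer   = 1000,  # limit quoted in the text for a dedicated server
        [int]$ToleratedFailures = 1      # how many pool members may fail at the same time
    )

    # Only the surviving servers count towards the usable capacity.
    $survivors = $ManagementServers - $ToleratedFailures
    if ($survivors -lt 1) { return 0 }
    $survivors * $AgentsPerServer
}

Get-XPlatPoolCapacity -ManagementServers 2   # 1000, the two-server example from the text
Get-XPlatPoolCapacity -ManagementServers 3   # 2000
```

A single server returns 0 here because it cannot tolerate any failure at all.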

Now back to the agent failover topic. I think it has already become pretty clear how the Linux/UNIX agent failover happens behind the scenes, but a short summary won’t do any harm:

  • After the discovery wizard is started, a Resource Pool must be selected for managing the systems.
  • When the Resource Pool is selected, it assigns one of the participating management servers to complete the actual discovery of the systems and take over the monitoring.
  • When the management server fails, the Resource Pool selects one of its other members to take over the monitoring.

Here we come back to what we said in the beginning: the UNIX/Linux agent is passive and is queried by the management server. Because of this, it is not actually aware of what happens in the background and continues to communicate with whichever server has now been assigned to it.

Now is also the right time to make a couple of very important notes:

  • XPlat (cross platform) certificates

Part of preparing the environment for the monitoring of cross platform systems is the creation of a self-signed certificate and its deployment to every management server that is a member of the Resource Pool. This ensures that in case of failover each management server will be able to communicate with the agent, using the same certificate.

  • High availability with Operations Manager Gateways as members of the Resource Pool (thanks to Graham Davies for the reminder)

What I forgot to mention in the first version of this post, but which is of high importance for maintaining high availability of your Gateway Resource Pools (Resource Pools consisting of Operations Manager Gateway servers), is the fact that two Gateways are not sufficient for achieving high availability. Why? You will find the answer in the article Kevin Holman wrote about Resource Pools in SCOM and how exactly they provide high availability. This is the same article Sam posted in the first part, and it is a must-read if you have to plan for and manage Resource Pools and cross platform agent failover in Operations Manager:

Understanding SCOM Resource Pools

https://kevinholman.com/2016/11/21/understanding-scom-resource-pools/

Conclusion

Understanding UNIX/Linux agent high availability and failover is not a hard thing to do. Still, in order to properly plan for Operations Manager cross platform monitoring, there are some additional things like sizing and scalability that need to be considered.

Awesome, thanks a lot, Stoyan! It was a very informative and interesting read, as usual! 🙂

You can get in touch with Stoyan on LinkedIn or Twitter, or visit his Technet profile.

Cheers!

Autogrowth on SCOM Operational DB?

This is another of the hot topics I find with differences in opinion among the experts.

The other one we discussed was Windows Agents and Failover – Debunking the Myth!

Should you enable autogrowth on SCOM Operational Database?

I did some research online, consulted some of the best SCOM experts I know and put together an article that explains why you would NOT want to autogrow your SCOM DB.

The short version is:

DO NOT autogrow your SCOM Operational DB unless you absolutely need to. Autogrowing the DB comes with its own set of disadvantages and might affect the performance of the DB.

So, choose the size of your DB very carefully while you are designing your Management Group!

The longer and more detailed version is here:

Should You Enable Autogrowth on SCOM Operations Database?

Cheers!

PS. Special thanks to Stoyan Chalakov and “SCOM Bob” Cornelissen for reviewing the article and suggesting edits! 🙂

Windows Agents and Failover – Debunking the Myth!

The myth: “If the primary Management Server is down, the windows agents will automatically failover to any Management Server in the Resource Pool.”

It’s been 6 years since the release of SCOM 2012, and yet the failover process in SCOM is still widely misunderstood. SCOM 2012 came out with the concept of “Resource Pools”, essentially replacing and enhancing the previous “Root Management Server” concept, and those Resource Pools are a big part of the confusion.

Why was the concept of Resource Pools introduced? For failover? Sure, but probably not in the way you are thinking. I talk very frequently to other SCOMers in person and online, and I often find that their understanding of Resource Pools is not very accurate. So, I thought about writing a two-part blog explaining the failover process in SCOM: one part for the Windows Agents and the other for Unix/Linux and network agents.

I will talk about the Windows Agents failover part here, and my friend Stoyan Chalakov was generous enough to agree to write on the U/L and networking part. So, let’s get started!

Before we jump into the actual failover process, let’s briefly recap what Resource Pools are and what they do.

Basically, the concept of Resource Pools was introduced to eliminate the Root Management Server as the single point of failure. Up to and including SCOM 2007, the RMS was the boss and the other MS were under it in the management group hierarchy. Many critical workflows were specifically targeted at the RMS, so there was a risk of your SCOM being paralyzed if the RMS went down. On top of that, making the RMS highly available meant clustering it, which was complex to set up and maintain.

So starting from SCOM 2012, Microsoft came up with the concept of Resource Pools and the idea that all the Management Servers are peers, not members of a hierarchy. That simplified many things, and the workflows that had been running on the RMS were now running on the members of the Resource Pools.

When you install SCOM, you get 3 default Resource Pools out of the box: the “All Management Servers Resource Pool”, which deals with most of the legacy RMS workflows, the “Notifications Resource Pool”, which deals with notifications (the alert subscription service), and the “AD Assignment Resource Pool”, which deals with AD Integration.

The scope of this blog is not to go into much detail on Resource Pools; there are a couple of very good blogs out there that discuss them in great detail. The one we’ll discuss here in particular is the “All Management Servers Resource Pool”, and specifically what it DOES NOT do.

Some reading material on Resource Pools:
Understanding SCOM Resource Pools

Resource pool design considerations

OpsMgr (#SCOM) Resources Pools–What they do not do [#SYSCTR]

Now coming back to the failover thing – I’m sure most of you have read or heard that Resource Pools provide failover and high availability in SCOM. Which is true. But you may also be thinking that the Resource Pools (notably the All Management Servers Resource Pool) provide failover for your Windows agents. This is simply not true. In almost all of the blogs and even in the official Microsoft documentation, when you’re reading about Resource Pools, there is a line mentioned somewhere, “Windows agents do not report to resource pools” – and that’s it. Nothing else. No further explanation, no further discussion at all. That is why it is often just skimmed over or simply forgotten.

So, what does “Windows agents do not report to resource pools” actually mean?

Let’s have a case:

3 Management Servers: MS1, MS2, MS3

2 Gateway Servers: GW1 (reports to MS1) and GW2 (reports to MS2)

As the name suggests, we have all the MS in the “All Management Servers Resource Pool”.

Now let’s understand how the failover takes place should the MS or GW go down.

Case 1: Management Server goes down –

Let’s say it’s the MS3 that failed. All the agents reporting to MS3 are RANDOMLY failed over to either MS1 or MS2 (for successful failover, of course you need the required port 5723 open to all MS). This is the out-of-the-box feature of SCOM and does not require you to set up AD Integration. This process is random by default, but you CAN configure which Management Server you want it to failover to, using Powershell:

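# Replace MS3 and MS1 with your own management server names (typically the FQDN)
$pri = Get-SCOMManagementServer -Name "MS3"
$sec = Get-SCOMManagementServer -Name "MS1"
# Agents whose primary management server is MS3
$agents = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq $pri.Name}
# -PrimaryServer and -FailoverServer belong to different parameter sets, so the failover is set in its own call
Set-SCOMParentManagementServer -Agent $agents -FailoverServer $sec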

Now, once you run this, the agents will fail over to the Management Server YOU want, instead of failing over randomly. This is NOT affected by which Management Servers you have in which Resource Pool. Let’s say I removed one (or all) Management Server(s) from the All Management Servers Resource Pool: this behavior is NOT affected (don’t do that though, it’ll cause other problems!). The agents will still fail over to any Management Server in the Management Group.

When you install a Windows agent, you configure it to report to a particular Management Server (or GW) only. The Resource Pool simply doesn’t play a role here.

In conclusion, Windows agents will failover to any available Management Server RANDOMLY (unless explicitly configured) and this behavior is NOT affected by any Resource Pools (default or custom).
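To check what is currently configured, a minimal sketch like the one below can list each agent with its primary and failover Management Servers. It assumes the OperationsManager module is installed and a management group connection already exists. The `GetFailoverManagementServers()` call is the SDK method on the agent object as I understand it, so treat this as an illustration rather than a tested script. An empty Failover column would mean no failover server has been set explicitly.

```powershell
# Sketch only: assumes the OperationsManager module and an existing management group connection
Import-Module OperationsManager

Get-SCOMAgent | ForEach-Object {
    [pscustomobject]@{
        Agent    = $_.DisplayName
        Primary  = $_.PrimaryManagementServerName
        # Explicitly configured failover servers, if any (empty when none are set)
        Failover = ($_.GetFailoverManagementServers() | ForEach-Object { $_.DisplayName }) -join ', '
    }
} | Sort-Object -Property Primary, Agent | Format-Table -AutoSize
```

Running it before and after the `Set-SCOMParentManagementServer` call is a quick way to confirm the change was applied to the agents you intended.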

Case 2: Management Server with a GW reporting goes down –

Let’s say MS1 goes down. GW failovers are not automatic and unlike agents, they DO NOT failover randomly to any other available MS. You need to configure the GW explicitly for failover to MS2 or MS3, using Powershell.

# Configure GW1 (primary: MS1) to fail over to MS2
$primaryMS = Get-SCOMManagementServer | where {$_.Name -match "MS1"}
$failoverMS = Get-SCOMManagementServer | where {$_.Name -match "MS2"}
# Select only GW1, so that GW2 (which reports to MS2) is left untouched
$gatewayMS = Get-SCOMManagementServer | where {$_.IsGateway -eq $true -and $_.Name -match "GW1"}
Set-SCOMParentManagementServer -GatewayServer $gatewayMS -PrimaryServer $primaryMS
Set-SCOMParentManagementServer -GatewayServer $gatewayMS -FailoverServer $failoverMS

Case 3: The GW server goes down –

Let’s say GW1 goes down. Again, the agents will NOT automatically failover to another GW server. You will need to configure the agents to use another server (GW2), using a Powershell script.

# Agents reporting to "GW1" - failover to "GW2"
$primaryMS = Get-SCOMManagementServer | where {$_.Name -eq "GW1"}
$failoverMS = Get-SCOMManagementServer | where {$_.Name -eq "GW2"}
$agent = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq "GW1"}
Set-SCOMParentManagementServer -Agent $agent -PrimaryServer $primaryMS
Set-SCOMParentManagementServer -Agent $agent -FailoverServer $failoverMS

Note: Scripts are for example only. You may need to modify them according to your requirements.

Now you’re probably thinking, how come people say that the Resource Pools are used for Failover and high availability then? Fair question! The answer is, they do provide automatic failover to the workflows that are running on the health services of the members of the resource pools. Windows agents run their workloads on their respective health services local to them; hence they have no relationship with the Resource Pools.

In other words, the failover and high availability that resource pools provide is actually for the Management Servers, and not for the Windows agents reporting to them.

However, this is not the case with Unix/Linux agents. I will not go into details of it here though, because Stoyan will have an entire blog dedicated to this in part 2, so I’ll let him dive into the details. 😉

Hope this clarifies some misunderstandings and helps someone out there plan their deployment correctly!

Cheers!

Cyril Azoulay – An expert behind the mask! – [Interview]

I am a regular visitor/contributor to the SCOM Technet Forums and regularly meet many experts from around the globe there. Fortunately for me I eventually became friends/acquaintances with many of them. One of them is Cyril! If you have posted a question in the forums, you may very well have received the solution from a “CyrAz”, that’s him! 😉

I was able to convince Cyril to feature in one of our community experts interview series, and from his answers you can easily guess the knowledge and experience Cyril possesses! Enough chatter from me, I will let Cyril do the talking now:

Q: Hi Cyril! Thanks for agreeing to this. I see you very often on the SCOM Technet forums, and from what I can tell, you really know what you’re talking about! Can you please tell us a little about yourself and your professional journey so far?
A: Thanks for having me 🙂
I’m a system consultant and I almost exclusively work on Microsoft technologies. I work on quite a lot of them (Active Directory, PKI, Exchange, Direct Access, HyperV/Storage Space direct and even Azure Stack these days) but my current field of expertise is System Center, and more specifically SCOM and Orchestrator.
I started working on these two technologies around 6 years ago when my boss wanted me to specialize on something and asked me if I was interested in System Center. I said yes, because I was interested in the whole “everything is integrated together” aspect of the suite, being able to manage every aspect of the datacenter etc. I quickly realized this ideal world only existed in Microsoft’s presentations and that most of these products were actually not designed to run together and were even pretty clunky by themselves, but I still really liked what they allowed to do!
So I went from customer to customer, doing things that were more and more complicated, from daily operator tasks to architecture design and complex MP development; until now where I believe I can call myself an “expert”, even though I’m still learning things every day. I’ve only started using Visual Studio for MP developments a little bit over a year ago!
And I can say that blogs and Technet forums helped me a lot during this journey, it’s really nice to have such a knowledgeable and sharing community!
Q: What is your opinion about Microsoft’s strategy of 2 different models for System Center products? What do you prefer, the LTSC or SAC model?

A: I’m glad Microsoft is still developing SCOM, and I can’t wait to get the most recent improvements so I definitely prefer SAC. But to be honest, I’m a bit disappointed by what was added in 1801 and 1807… I’m convinced SCOM has great foundations, the Management Pack system is incredibly powerful and flexible, but there is so much that could be done to improve the final product!

Now from my customers point of view, it really depends on how much they can afford in maintaining their environments… if they have a dedicated team that has enough knowledge, they can go for SAC. If they rely exclusively on me coming from time to time to check that everything is running properly, that may not be the best idea.

Q: How do you feel about the whole SCOM vs. OMS argument? Will OMS be a viable “replacement” for SCOM?
A: I must say I really like Log Analytics and what it is becoming, especially since Kusto query language was released. I was already a huge fan of competing products such as Splunk, and that pretty much sums up what I think about the SCOM vs Log Analytics argument : it has no reason to exist since they are not competitors!
Log Analytics is becoming a great tool for… well, ingesting, archiving and analyzing logs of all kind.
But for now it is a terrible replacement for SCOM, regardless of what Microsoft is trying to make us believe! It doesn’t provide any solution to replace the huge list of existing management packs and everything they include, it doesn’t provide an easy solution to develop your own monitoring, it has terrible SLA regarding data ingestion latency…
However, I can very well see how they can work together to provide more information on your environment, and how Log Analytics can become a good monitoring tool someday. But definitely not today, and probably not tomorrow neither.
Q: Given a chance, what would you like to ask or advise to Microsoft?
A: Don’t abandon great tools that can still provide tremendous service to many people just because you believe they are not the future! And maybe try to make them a little bit more usable by non experts…
Q: Lastly, the traditional question! Star Wars or Star Trek?
A: A 4-cheese pizza and a cold beer, thanks 🙂
Awesome! Thanks again for sharing your expertise with us Cyril, and I hope you will continue to help the folks on the forums! 😀
You can also get in touch with him on LinkedIn:
You can also visit his blog (in French) here:
Cheers!
Sam

Authoring PowerShell SCOM Console Tasks in XML – Ruben Zimmermann

Summary:

This post describes how to create PowerShell SCOM Console Tasks in XML, using three examples.

Console-ListTopNProcesses

Introduction:

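Console Tasks are executed on the computer where the SCOM console is running. In the examples here, the console runs on a Management Server, so the scripts execute there. Three examples show how to create them using Visual Studio.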

  • Task 1: Displaying a Management Server’s Last-Boot-Time [DisplayLastBootTime]
    • Executes a PowerShell script which displays the result in a MessageBox
  • Task 2: Show all new alerts related to selected Computer [ShowNewAlerts]
    • Passes the Computer principal name property (aka FQDN) to the script which then uses a GridView to display the alerts
  • Task 3: Listing the top N processes running on a Management Server [ListTopNProcesses]
    • An InputBox lets the user specify the number (N) of top processes on the Management Server, which are retrieved by the script and shown via GridView

Requirements:

This blog assumes that you have already created a management pack before. If not, or if you run into difficulties, please visit the section ‘Reading’ and go through the links.

The used software is Visual Studio 2017 (2013 and 2015 should work as well) plus the Visual Studio Authoring Extension. Additionally, the ‘PowerShell Tools for Visual Studio 2017’ are installed.

The Community Edition of Visual Studio technically works, but please check the license terms in advance. If you are working for a ‘normal company’ you will most likely require Visual Studio Professional.

Realization:

Initial steps in Visual Studio

→ Create a new project based on Management Pack / Operations Manager 2012 R2 Management Pack template.

→ Name it SCOM.Custom.ConsoleTasks

→ Create a folder named Health Model and a sub folder of it named Tasks

→ Add an Empty Management Pack Fragment to the root and name it Project.mpx.

Project File Project.mpx Content:

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <LanguagePacks>

    <LanguagePack ID="ENU" IsDefault="true">

      <DisplayStrings>

        <DisplayString ElementID="SCOM.Custom.ConsoleTasks">

          <Name>SCOM Custom ConsoleTasks</Name>

        </DisplayString>

      </DisplayStrings>

    </LanguagePack>

  </LanguagePacks>

</ManagementPackFragment>

Your screen should look like this now:

[Screenshot: VSAE-ProjectFile.gif]

Creating DisplayLastBootTime task

Within the Tasks folder create a class file named ConsoleTasks.mpx and remove all XML inside the file.

ConsoleTasks.mpx initially contains only the code for the task DisplayLastBootTime and will be extended step by step with the other two tasks.

 

Content of ConsoleTasks.mpx :

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Categories>

    <Category ID ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/>

  </Categories>
  <Presentation>

    <ConsoleTasks>      

      <ConsoleTask ID="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask" Accessibility="Public" Enabled="true" Target="SC!Microsoft.SystemCenter.RootManagementServer" RequireOutput="false">

        <Assembly>SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -Command "& { $IncludeFileContent/Health Model/Tasks/DisplayManagementServerLastBootTime.ps1$ }"]]></Argument>          

        </Parameters>

      </ConsoleTask>    

    </ConsoleTasks>

  </Presentation>
  <LanguagePacks>

    <LanguagePack ID="ENU" IsDefault="true">      

      <DisplayStrings>        

        <DisplayString ElementID="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask">

          <Name>Custom Console-Tasks: Display LastBootTime</Name>

          <Description></Description>

        </DisplayString>        

      </DisplayStrings>

    </LanguagePack>

  </LanguagePacks>
  <Resources>

    <Assembly ID ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly" />

  </Resources>

</ManagementPackFragment>

 

Key points explanation for ConsoleTasks.mpx

Categories

  • Category Value specifies that this element is a console task

Presentation

  • ConsoleTask Target defines against what this task is launched. In this case it’s the Management Server. You can only see the task if you click on a View that is targeting the corresponding class, e.g.:

[Screenshot: Console-ManagementServer-Tasks.gif]
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).

LanguagePack

  • DisplayString maps the Console Task ID to a text that is more user friendly. This will be shown in the SCOM console

Within the Tasks folder create a PowerShell file named DisplayManagementServerLastBootTime.ps1

Content of  DisplayManagementServerLastBootTime.ps1 :
$regPat         = '[0-9]{8}'

$bootInfo       = wmic os get lastbootuptime

$bootDateNumber = Select-String -InputObject $bootInfo -Pattern $regPat | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value

$bootDate       = ([DateTime]::ParseExact($bootDateNumber,'yyyyMMdd',[Globalization.CultureInfo]::InvariantCulture))

$lastBootTime   = $bootDate | Get-Date -Format 'yyyy-MM-dd'




$null = [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.VisualBasic')

$null = [Microsoft.VisualBasic.Interaction]::MsgBox($lastBootTime,0,"""Last Boot time of $($env:COMPUTERNAME)""")
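To make the regular expression and ParseExact steps easier to follow, here is a minimal sketch that applies the same parsing to a sample string. The helper name `ConvertTo-BootDate` and the sample value are made up for illustration. The sample only mimics the kind of timestamp that `wmic os get lastbootuptime` returns.

```powershell
# Hypothetical helper: extracts the date part (first 8 digits) from a WMI-style timestamp
function ConvertTo-BootDate {
    param([Parameter(Mandatory)][string]$WmiText)

    $match = [regex]::Match($WmiText, '[0-9]{8}')
    if (-not $match.Success) { throw 'No 8-digit date found in the input.' }

    $invariant = [Globalization.CultureInfo]::InvariantCulture
    $date = [DateTime]::ParseExact($match.Value, 'yyyyMMdd', $invariant)
    return $date.ToString('yyyy-MM-dd', $invariant)
}

# Made-up sample value in the format wmic prints
ConvertTo-BootDate -WmiText '20180912093015.500000+120'

# On Windows, CIM returns the boot time as a DateTime directly, so no parsing is needed:
# (Get-CimInstance -ClassName Win32_OperatingSystem).LastBootUpTime
```

If you move any of this into the task script itself, keep in mind that the file content is embedded in a powershell.exe command line, which is presumably why the original script escapes its double quotes.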

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:
[Screenshot: Console-DisplayLastBootTime.gif]

Creating ShowNewAlerts task

Keep the content of ConsoleTasks.mpx unchanged. Add the new code into the proper places.

Additional Content for ConsoleTasks.mpx for ShowNewAlerts:

<Category ID ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/>

<ConsoleTask ID="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Computer" RequireOutput="false">

        <Assembly>SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -noexit -Command "& { $IncludeFileContent/Health Model/Tasks/ShowNewAlertsForThisComputer.ps1$ }"]]></Argument>

          <Argument>"$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"</Argument>

        </Parameters>

      </ConsoleTask>
<DisplayString ElementID="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask">

          <Name>Custom Console-Tasks: Display ShowNewAlertsForThisComputer</Name>

          <Description></Description>

        </DisplayString>

<Assembly ID ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly" />

Key points explanation for ConsoleTasks.mpx for ShowNewAlerts

Categories

  • Category Value specifies that this element is a console task

Presentation

  • ConsoleTask Target defines against what this task is launched. In this case it’s the Windows Computer. – This task is visible when in a View that shows Computer objects.
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).
  • Parameters, Argument “$Target/Property […]/PrincipalName$” gets the FQDN (principal name attribute) of the selected computer and makes it available for retrieving it in the script.

 

Content of ShowNewAlertsForThisComputer.ps1 :
param($ComputerName)

$allNewAlerts = Get-SCOMAlert -ResolutionState 0 | Where-Object {$_.PrincipalName -eq $ComputerName} | Select-Object -Property Name, Description, MonitoringObjectDisplayName, IsMonitorAlert, ResolutionState, Severity, PrincipalName, TimeRaised

if ($allNewAlerts) {

       $allNewAlerts | Out-GridView

} else {

       Write-Host ('No new alerts available for the computer: ' + $ComputerName)

}

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:

[Screenshot: Console-ShowNewAlerts]

Creating ListTopNProcesses task

Keep the content of ConsoleTasks.mpx unchanged. Add the new code into the proper places.

Additional Content for ConsoleTasks.mpx for ListTopNProcesses:

<Category ID ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/>

<ConsoleTask ID="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask" Accessibility="Public" Enabled="true" Target="SC!Microsoft.SystemCenter.RootManagementServer" RequireOutput="false">

        <Assembly>SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -noexit -Command "& { $IncludeFileContent/Health Model/Tasks/ListTopNProcesses.ps1$ }"]]></Argument>

        </Parameters>

      </ConsoleTask>

<DisplayString ElementID="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask">

          <Name>Custom Console-Tasks: Display ListTopNProcesses</Name>

          <Description></Description>

        </DisplayString>     

<Assembly ID ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly" />

Key points explanation for ConsoleTasks.mpx for ListTopNProcesses

Categories

  • Category Value specifies that this element is a console task

Presentation

  • ConsoleTask Target defines against what this task is launched. In this case it’s the Management Server. – This task is visible when in a View that shows Management Server objects.
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).

 

Content of ListTopNProcesses.ps1 :
$null          = [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.VisualBasic')

$receivedValue = [Microsoft.VisualBasic.Interaction]::InputBox("""Enter the number of top processes to be shown""", """List top processes:""", "10")

Get-Process | Sort-Object -Property CPU -Descending | Select-Object -First $receivedValue  | Out-GridView
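The script above passes whatever the user typed straight to `Select-Object -First`, so an empty box (for example after pressing Cancel) or a non-numeric entry would raise an error. A minimal sketch of a guard is shown below. The function name `ConvertTo-TopCount` and the fallback of 10 are assumptions for illustration, not part of the original solution.

```powershell
# Hypothetical helper: turns the InputBox text into a positive number, or falls back to a default
function ConvertTo-TopCount {
    param(
        [string]$Text,
        [int]$Default = 10
    )

    $number = 0
    # TryParse returns $false for empty or non-numeric text instead of throwing
    if ([int]::TryParse($Text, [ref]$number) -and $number -gt 0) {
        return $number
    }
    return $Default
}

# In the task, $receivedValue would come from the InputBox; a fixed value is used here
$receivedValue = '5'
Get-Process | Sort-Object -Property CPU -Descending | Select-Object -First (ConvertTo-TopCount -Text $receivedValue)
```

The helper deliberately avoids double quotes, so it should need no extra escaping if it is placed in a script that is embedded in the task's command line.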

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:

[Screenshot: Console-ListTopNProcesses.gif]

Reading:

If you are new to management pack authoring I suggest the free training material from Brian Wren. – It was made for SCOM 2012 R2 but is still valid until today (current SCOM 1807 in year 2018).

On Microsoft’s Virtual Academy: System Center 2012 R2 Operations Manager Management Pack

On Microsoft’s Wiki: System Center Management Pack Authoring Guide

Closing:

If you have questions, comments or feedback, feel free to share them in our SCOM – Gitter – Community on: https://gitter.im/SCOM-Community/Lobby

The Visual Studio solution file is downloadable here:

SCOM Console Powershell Task

Thanks!

Ruben Zimmermann

 

Know more about Ruben and get in touch with him here:

Ruben Zimmermann (A fantastic person who has a lot of great ideas) [Interview]

SCOM User Session Duration – Powershell

A few days ago, I needed to find out how many users are connecting to SCOM daily/weekly and how long was each user connected. Out-of-the-box SCOM does not provide you a way of doing this. So I started looking around for some hints. I came across this article, which looked pretty convincing.

https://blogs.technet.microsoft.com/dirkbri/2014/10/15/an-approach-of-collecting-and-analyzing-scom-2012-sdk-connections/

This looked all good, except I wanted to do it with Powershell.

Whenever any “client” connects/disconnects to/from the console an event is written in the Operations Manager event log. In each event there is event data which gives you information on the following points:

  • EventID
  • ManagementServer Name
  • Username
  • SessionID
  • UUID – extracted from SessionID
  • ID – extracted from SessionID
  • TimeCreated
  • SessionDuration – calculated time between log on and log off event
  • SessionCount – calculated cumulated counter of all current sessions

So, I started working on a script that’d meet my requirements. To be honest, I am still a Powershell student, so I got stuck halfway. So I decided to “get-help” (Powershell pun, get it? ;)) from my friend and guide Stoyan Chalakov. I explained my idea to him and asked whether he would help me script it. Sure enough, I wasn’t disappointed. He liked the idea as well, and came up with a nice script that outputs the data the way we imagined. I’ve linked his script on Technet at the end.

Some important remarks which apply to both methods (the one described in the blog and the script):

“Yes, there are design flaws in this approach:

  1. If you create daily files, you will have an unknown amount of already open sessions at the beginning of the day and a certain amount of not yet closed sessions at the end. So your SessionCount will be lower than the cumulated values from the “Client Connections” performance counters of all Management Servers.
    But when you analyse e.g. weekly data instead of daily, you should get very good results.
  2. The tricky part is , the exact same event is written when a Powershell SDK connection is made. You cannot distinguish between PowerShell SDK connections and Operations console connections. Unfortunately this is by design and I am not aware of a way to mitigate this problem.”

The script parses all the events that are logged for the particular user and gathers information about:

  • Open sessions, where the session has not been closed (presence of the Login event, absence of the Logout event)
  • Closed sessions and their duration (presence of both the Login event and the Logout event)
  • Closed sessions without a Login event, indicating that the Login event has been overwritten in the log and cannot be found.
  • The script should be run directly on a management server. If you need to, you can enable PSRemoting and run it on multiple management servers.
  • The script is in its first version, where you need to enter a domain and username. The second version (currently being worked on) will include a full report on all user sessions and their duration, without the need to specify a particular user name.
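The three session states in the list above can be sketched as follows. This is not Stoyan’s script, only a minimal illustration of the pairing logic. It works on made-up, already-parsed records instead of real event log entries, and the property names (Type, SessionId, Time) and the function name `Get-SessionSummary` are assumptions.

```powershell
# Made-up sample records; in practice these would be parsed from the Operations Manager event log
$sampleEvents = @(
    [pscustomobject]@{ Type = 'Login';  SessionId = 'A'; Time = [datetime]'2018-09-12T08:00:00' }
    [pscustomobject]@{ Type = 'Logout'; SessionId = 'A'; Time = [datetime]'2018-09-12T09:30:00' }
    [pscustomobject]@{ Type = 'Login';  SessionId = 'B'; Time = [datetime]'2018-09-12T10:00:00' }
    [pscustomobject]@{ Type = 'Logout'; SessionId = 'C'; Time = [datetime]'2018-09-12T11:00:00' }
)

# Hypothetical helper: pairs Login and Logout records per session and classifies each session
function Get-SessionSummary {
    param([object[]]$SessionEvents)

    foreach ($group in ($SessionEvents | Group-Object -Property SessionId)) {
        $login  = $group.Group | Where-Object { $_.Type -eq 'Login' }  | Select-Object -First 1
        $logout = $group.Group | Where-Object { $_.Type -eq 'Logout' } | Select-Object -First 1

        $status   = 'LoginMissing'   # Logout found, but the Login record is gone
        $duration = $null
        if ($login -and $logout) {
            $status   = 'Closed'
            $duration = $logout.Time - $login.Time
        } elseif ($login) {
            $status = 'Open'         # Login found, no Logout yet
        }

        [pscustomobject]@{
            SessionId = $group.Name
            Status    = $status
            Duration  = $duration
        }
    }
}

Get-SessionSummary -SessionEvents $sampleEvents
```

Grouping by session ID first keeps the logic independent of the order in which the events were read from the log.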

Sample output of the script:

user session

You can download the script here:

Stoyan has commented the script thoroughly, so you should have no trouble understanding how it works.

SCOM User Session Script

Thanks a ton Stoyan!

Cheers!

Management Server Frequently Greying Out?

I have seen this issue happening a number of times now. The cause of this can be a few things going wrong, but as part of the troubleshooting I’ve noticed a way that works almost every time, if it applies.

Problem :

All of a sudden, the management server(s) grey out. You check the services, and all the services are running. Still, for good measure, you restart the services – but no use. You then also try flushing the health state folder cache on the affected MS. And sure, the MS becomes healthy again.

But again after some time you notice that the MS has greyed out. You repeat the process of flushing the cache, it becomes green, and after some time becomes grey again. This cycle continues.

In the event log you may see several events and not be sure where to start. Any of these events may be the actual cause of the problem, or just a consequence of it. That’s why you need to read carefully through each of them and work out which event is the actual problem and which ones are consequences.

The event we’re discussing here is one particular event, 4502. Now this event ID is logged for a number of different reasons and with different descriptions. The one we’re looking for goes something like this (sample only, your descriptions will differ accordingly):

A module of type "Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.InstanceSpaceSubscriptionDataSource" reported an exception System.ArgumentNullException: Value cannot be null.

Parameter name: value

   at System.Collections.CollectionBase.OnValidate(Object value)

   at System.Collections.CollectionBase.System.Collections.IList.Add(Object value)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.HttpRESTClient.PostDataAsync(Byte[] data, Object context)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.SubscriptionDataSource`2.WriteToCloud(List`1 items, DateTime firstTryDateTime)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.SubscriptionDataSource`2.PostAsync(List`1 items, DateTime firstTryDateTime) which was running as part of rule "Microsoft.SystemCenter.CollectInstanceSpace" running for instance "All Management Servers Resource Pool" with id:"{4932D8F0-C8E2-2F4B-288E-3ED98A340B9F}" in management group "MG".

These events may come in conjunction with several others, but I like to fix this one first, as it solves the problem most of the time.

Analysis : 

The event might seem cryptic at first, especially if you aren’t used to troubleshooting, but it provides a valuable piece of information. Note the last line of the description. It says,

which was running as part of rule "Microsoft.SystemCenter.CollectInstanceSpace" running for instance "All Management Servers Resource Pool" with id:"{4932D8F0-C8E2-2F4B-288E-3ED98A340B9F}" in management group "MG".

Here, you get some interesting information, as to which exact rule/monitor is failing, and running for what instance.

Ok, so we have the rule ID and the target. The rule is “Microsoft.SystemCenter.CollectInstanceSpace”. A quick glance at the System Center Wiki tells me that the display name of this rule is “Send Instancespace to the Cloud” and it is a “System rule that sends instancespace up to the cloud.”

So what happens here is, the rule runs at its scheduled interval, and fails. This causes the MS it is running on to go grey. When you re-initialize the cache on the MS, everything is reset, and the MS becomes green. Then again, the rule runs at its interval and fails again, the MS goes grey again, and the cycle goes on.
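Reading the rule name out of each event by hand gets tedious when there are many of them. The sketch below pulls the rule ID and the instance out of a description like the one shown above. The helper name `Get-FailingWorkflow` is made up, and the pattern assumes the description uses exactly the “part of rule … running for instance …” wording from the sample, which may not hold for every variant of event 4502.

```powershell
# Hypothetical helper: extracts the rule ID and target instance from a 4502 event description
function Get-FailingWorkflow {
    param([Parameter(Mandatory)][string]$Message)

    $pattern = 'part of rule "(?<rule>[^"]+)" running for instance "(?<instance>[^"]+)"'
    $match   = [regex]::Match($Message, $pattern)
    if ($match.Success) {
        [pscustomobject]@{
            Rule     = $match.Groups['rule'].Value
            Instance = $match.Groups['instance'].Value
        }
    }
}

# Sample text taken from the event description shown above
$sample = 'which was running as part of rule "Microsoft.SystemCenter.CollectInstanceSpace" running for instance "All Management Servers Resource Pool" with id:"{4932D8F0-C8E2-2F4B-288E-3ED98A340B9F}" in management group "MG".'
Get-FailingWorkflow -Message $sample

# On a management server, the same helper could summarise which rules fail most often:
# Get-WinEvent -FilterHashtable @{ LogName = 'Operations Manager'; Id = 4502 } |
#     ForEach-Object { Get-FailingWorkflow -Message $_.Message } |
#     Group-Object -Property Rule | Sort-Object -Property Count -Descending
```

A rule that shows up repeatedly in that summary is a good first candidate to investigate.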

Resolution :

Ok, so now we have some solid information to work on. Grab the rule name and find it under Rules in your console. Once you do, take a look at the properties. You’ll see what exactly it is doing, any overrides, which MP it is coming from, etc.

Now that you’ve found the rule that is the root of the problem, disable it. Now, go back and flush the cache on the MS again. As it is downloading the configuration again, keep an eye on the event log for any errors.

If the MS becomes and remains green, we’re done! If it goes back to grey, follow the process all over again, until you notice there are no more failing workflows from rules/monitors that are causing the MS to go grey.

One step further, if you notice that all these rules/monitors are from the same MP, chances are that MP has been corrupted and you may want to remove or update the MP.

Note that although this might solve your problem, it may not be the only one causing the issue. E.g., bad performance of your databases can also result in this problem. So if you find the problem is still persisting, look for other relevant events that might give you a hint. 🙂

You can refer to these threads from the Technet forums for further reading:

SCOM Health Service greyed out on Management Server

Management server getting greyed out again and again

Hope this helps someone out there with similar issues.

Cheers!

 

Guest Blog – Management Pack Tuning – Ruben Zimmermann

And Ruben is back at it again! This time he has a rather interesting topic, one that is always hot for a SCOM admin – tuning your management packs! Out-of-the-box, SCOM creates a lot of alerts. I mean A LOT. Truthfully, not every one of those alerts is useful, or relevant to you. If you just let it be like that, you or your support teams would waste a lot of time working on unnecessary alerts instead of focusing on the ones that actually matter. That is why any management pack you import must be tuned to focus only on the things that matter to you and your organization.

That is exactly what Ruben has come up with here. I’m sure this information will be critical for any SCOM admin. Here goes:


SCOM Management Pack tuning step by step

Preface

This post explains Management Pack tuning, the reasons why it is required and how it can be performed with the help of free tools and PowerShell.

Monitoring Console showing Alerts

Introduction

Every Management Pack contains rules and monitors to indicate a potential issue or expose an already existing problem. Another common type of rule collects performance data.

The Management Pack author or vendor decides which rules and monitors are enabled by default and which can be enabled by the SCOM Administrator via overrides.

Every environment is different, so the judgement of which rules and monitors are useful differs as well.

It is best practice to disable all rules and monitors that don’t bring an obvious benefit.
On the other hand, there might be disabled rules and monitors that could be useful for you, so you should enable them. The process of doing this is called ‘Management Pack tuning’.

For a few reasons it is important to perform Management Pack tuning immediately after importing.

  • Alerts that don’t have a noticeable impact just distract SCOM Administrators or Operators.
  • Performance data that is recorded but not analyzed consumes resources on the Database and makes SCOM significantly slower.
  • The more rules and monitors are active on the monitored computer the busier it is handling ‘monitoring’.
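To get a feel for how much performance data a Management Pack collects by default, you could count its enabled performance collection rules. This is a rough sketch using the stock OperationsManager cmdlets, not part of Ruben's tooling; the Management Pack display name is the one used as the example later in this post and should be adjusted to your environment.

```powershell
# Sketch only: list the performance collection rules that a Management Pack
# enables by default. Assumes the OperationsManager module and a management
# group connection; adjust the display name to the MP you want to inspect.
Import-Module OperationsManager

$mp = Get-SCOMManagementPack -DisplayName 'Windows Server 2008 Operating System (Monitoring)'

$perfRules = Get-SCOMRule -ManagementPack $mp |
    Where-Object {
        $_.Category.ToString() -eq 'PerformanceCollection' -and
        $_.Enabled.ToString() -ne 'false'
    }

'{0} performance collection rules are enabled by default' -f @($perfRules).Count
$perfRules | Sort-Object DisplayName | Select-Object DisplayName, Name
```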

A nice side effect is that you are documenting as you go. It becomes easy to tell someone what is monitored and what is not.

Invite subject matter experts to do the Management Pack tuning together with you.

This gives three direct benefits.

  1. The experts, e.g. DBAs, know what is monitored.
  2. The experts will tell you what is needed from their perspective.
  3. You, the SCOM Admin, can share the responsibility when it comes to the question ‘why did we not know about it before?’ or ‘why wasn’t there an alarm?’

Performing the tuning

As an example, we will use the Management Pack for Windows Server 2008.

Note: Usually you only need to care about Management Pack files named monitoring. Leave those called discovery untouched. Smaller Management Packs might just consist of a single file.

Preparation:

  1. Download the Windows Server Operating System Management Pack and run the setup.
    Keep the default location:

    C:\Program Files (x86)\System Center Management Packs\SC Management Pack for Windows Server Operating System
  2. Download MPViewer from https://github.com/JanVanMeirvenne/mpviewer
  3. Copy the PowerShell script from https://gist.github.com/Juanito99/ae7f1ec364ec55bfeb316c3e029d20b2 into PowerShell ISE or VSCode and name it MPTuining-RulesAndUnitMonitors.ps1
    • The script requires PowerShell v3 at minimum. Windows Server 2012 includes it by default; on older Windows Server versions, please install the current Windows PowerShell version (PowerShell 5.1 at the time of writing).
  4. Store everything on a Management Server (e.g. C:\Temp\MPTuning).
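A quick way to confirm the PowerShell requirement on the Management Server is to check the built-in $PSVersionTable variable. The variable name $isSupported is only used for this illustration.

```powershell
# Check that the local PowerShell version meets the script's v3 minimum.
$current     = $PSVersionTable.PSVersion
$isSupported = $current.Major -ge 3

if ($isSupported) {
    "PowerShell $current is sufficient."
} else {
    "PowerShell $current is too old; version 3 or later is needed."
}
```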

Handling:

  1. Create a new Override Management Pack for that specific MP and name it properly.
    e.g. ABC.Windows.Server.Overrides

    Administration Pane to create a Management Pack
    Naming the Management Pack properly
  2. Launch MPViewer, load the Management Pack named Microsoft.Windows.Server.2008.Monitoring.mp from the default location and choose “Save to Excel”.
    Management Pack Viewer
  3. Name the file WindowsServer2008MonitoringExport, for instance.
  4. Open the file in Microsoft Excel, select the Monitors – Unit sheet and hide all columns except A, D, H and O.
  5. In the Ribbon bar select Data and add a Filter. For column D, show only enabled monitors. Review them and decide whether they should stay enabled. From my perspective, all are useful.
    Excel sheet Monitors Unit showing filtered columns
  6. Invert the selection so that Enabled is set to False. Review these monitors as well. I left them as they are, too.
  7. Switch to the Rules sheet and limit visible columns to A, C, D, K and O. Afterwards set the filter to show Enabled: True and Category: PerformanceCollection.
    Excel sheet Rules showing filtered columns
  8. Copy the rules that do not seem useful into a text file and name it WindowsServer2008RulesToDisable.txt.

    Text file WindowsServerManagementPack2008_RulesToBeDisable.txt
  9. Note down the name of the Windows Server 2008 Monitoring Management Pack and the Override Management Pack.
    Administration Pane showing Windows Server MP Name
  10. Navigate to C:\Temp\MPTuning and open the PowerShell script MPTuining-RulesAndUnitMonitors.ps1 (with VSCode for example)
    1. The text file with the rules to disable needs to be in that folder, too.
      VSCode running script

The script parameters used in this example:

| Parameter | Value | Meaning |
|---|---|---|
| sourceManagementPackDisplayName | ‘Windows Server 2008 Operating System (Monitoring)’ | Management Pack that contains the rules and unit monitors we will override |
| overrideManagementPackDisplayName | ‘ABC.Windows.Server.Overrides’ | Management Pack we created to store our configuration changes (overrides) |
| itemType | rule | Specifies that we will change rules |
| itemsToBeEnabled | False | Rules will be disabled |
| inputFilePath | WindowsServerManagementPack2008_RulesToBeDisabled | Name of the file that contains the rule names we specified |

  1. Run the PowerShell script by hitting ‘Enter’.
  2. After a short while the overrides will appear in the Management Console.
    Authoring Pane Showing Overrides
  3. Repeat the procedure for rules that you would like to enable.
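To illustrate the idea behind the override step, here is a minimal sketch built on the stock OperationsManager cmdlets. It is not the script from the gist, and it rests on a few assumptions: the text file holds one rule display name per line, the file path shown is a hypothetical placeholder, and the override Management Pack is unsealed.

```powershell
# Sketch only, NOT the script from the gist: disable the rules listed in a
# text file by creating overrides with the stock OperationsManager cmdlets.
# Assumptions: one rule display name per line, and a hypothetical file path.
Import-Module OperationsManager

$sourceMpName   = 'Windows Server 2008 Operating System (Monitoring)'
$overrideMpName = 'ABC.Windows.Server.Overrides'
$inputFile      = 'C:\Temp\MPTuning\RulesToDisable.txt'   # hypothetical file name

# Read the list, trimming whitespace and dropping blank lines.
$wanted = Get-Content -Path $inputFile |
    ForEach-Object { $_.Trim() } |
    Where-Object { $_ }

$sourceMp   = Get-SCOMManagementPack -DisplayName $sourceMpName
$overrideMp = Get-SCOMManagementPack -DisplayName $overrideMpName

# Pick the rules from the source MP whose display name is in the list.
$rules = Get-SCOMRule -ManagementPack $sourceMp |
    Where-Object { $wanted -contains $_.DisplayName }

# Store the disabling overrides in the unsealed override MP.
if ($rules) {
    Disable-SCOMRule -Rule $rules -ManagementPack $overrideMp
}
```

For the reverse case, Enable-SCOMRule takes the same -Rule and -ManagementPack parameters.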

If you experience problems or have other questions, come and join our SCOM community at https://gitter.im/SCOM-Community/Lobby

Thanks Ruben!

You can learn more about Ruben here:
Ruben Zimmermann (A fantastic person who has a lot of great ideas) [Interview]

More from Ruben:

Guest Blog – Authoring a PowerShell Agent Task in XML – By Ruben Zimmermann

Cheers!