Tag: SCOM

Service Uptime Report in SCOM

This is a question that I often get asked by the customers I work with and apparently a lot of others as evident by the related questions on the forums.

One way of doing it is to author your own service monitor, but that involves considerable amount of knowledge and experience of management packs and the underlying coding. It usually takes a lot of time as well. Not everyone has the right knowledge or the time to spend on it so I thought I’d share a quick trick I do to measure uptime of a service and also be able to present it to the concerned parties in the form of a report.

It often happens that you have a service running on your servers and many organizations use it as a “proof” to show that the application was running, or maybe as analysis for troubleshooting, hence it becomes necessary to be able to measure the uptime of the service accurately and to be able to show it to the management and/or to pass it around.

The thing is, when you’re creating a monitor to measure availability of a service in SCOM, you usually choose a “Basic Service Monitor”. This monitor is not very “intelligent” and simply places an instance of itself on every server belonging to the class that you choose. However, you do have another option to monitor your service with, and it is the “Windows Service Template”. This type provides you much more features and finer control on your service monitoring. I wrote a blog earlier comparing these two options of service monitoring and when to use what.

SCOM basic service monitor Vs Windows Service Template

So the way the Windows Service template works is that it creates a discovery of it’s own and hence creates an actual class. This class can further be used to target other workflows that you may have to monitor this class of servers. Another advantage of that is you can now use this class to fetch a “Service Availability” report. For example:

Let’s say I’m monitoring the Spooler service on a bunch of servers, and I need to be able to see the uptime of this service on each of these servers. So I create a Windows Service Template monitor, call it “Test Spooler”.

Now once it’s done, go to “Discovered Inventory” tab under Monitoring. Click on “Change the target type” and you can see that now the “Test Spooler” is a class available for targeting.

This means you can also target your availability report to this target as well and measure the uptime of the service it is monitoring.

You can also fetch a report for a group of service monitors created this way. There’s a good discussion we had a while back regarding this exact requirement:

Service Availability Report

Hope that helps!

Cheers

SCOM 2012 R2 to 1801 Side-by-Side Migration : The Powershell Way! – By Ruben Zimmermann

Ruben is back with another awesome blog post, and I have no doubt this one is going to help a lot of people!

We all know the hassles of migrating your SCOM from one version to another, and it involves a lot of manual efforts. And especially so when you choose the side-by-side upgrade path. Making sure you have all the Management Packs re-imported, all overrides are intact, making sure all permissions to various users you’ve assigned over the years are still in place, etc just to mention a few examples. We’ve all wished how great it would be if you could run some kind of script and it’ll do it all for us. Well, this is exactly what Ruben has made available for us! Here, take a look:

Preface

Migrating from SCOM 2012 R2 to SCOM 1801 isn’t a stressful job as both environments can live in parallel.

This blog gives examples how to migrate the configuration from A to B by trying to leverage PowerShell whenever possible / reasonable.

Please apply the PowerShell code only if you feel comfortable with it. If there are questions, please don’t hesitate to drop me a line.

Introduction

Although it is possible letting the agent do duplicate work in the sense of executing code in management packs, sending data to different management groups can cause excessive resource consumption on the monitored computer.

I suggest:

  • Migrate configuration to 1801 from 2012 R2 (blog topic)
  • Test with a very small amount of machine of different types (Application Servers, Web Servers, MSSQL, Exchange, Domain Controllers)
  • Move the remaining, majority to 1801
  • Change connectors to ticket systems, alert devices other peripheral systems
  • Remove the agent connections from 2012 R2 (see below for details)
  • Monitor the new environment for a while, if all fine decommission 2012 R2

Requirements

  • 1801 is setup fully functional and first agents have been deployed successfully.
  • Windows PowerShell 5.1 is required on the 2012 R2 and on the 1801 server
  • Service Accounts used in 2012 R2 will be re-used. – If they are different in 1801 it is no big obstacle to change them if they are overwritten by one of the import steps.
    • Usually SCOM will alert if there is misconfiguration such a lack of permission somewhere.

Migration

Below now the steps separated in the different parts. – No guarantee for complete-  or correctness.

Management Packs

Migrating to a new environment is always a great chance to perform cleanup and apply the lessons you’ve learned in the old one.

Review

Export all Management Packs from the current SCOM environment and review.

Get-SCOMManagementPack | Select-Object -Property Name, DisplayName, TimeCreated, LastModified, Version | Export-Csv -Path C:\temp\CurrentMps.csv -NoTypeInformation

For your convenience use Excel to open the CSV file. Import only those Management Packs which bring measurable benefit.

Overrides

Standard Approach

If you have followed best practices you have created Override Management Packs for each Management Pack to store your customizations. Export those and import them into your new environment.

Note: Verify that the overrides work as expected.

Green field approach

In case you have stored the all overrides only in one Management Pack or want to be more selective follow these steps

  1. Create a new Override Management Pack for that specific MP and name it properly.
    1. E.g. <YourCompayName>.Windows.Server.Overrides

    2. Follow the steps mentioned in ‘Management Pack tuning’.
      https://anaops.com/2018/06/22/guest-blog-management-pack-tuning-ruben-zimmermann/

Notification Settings (Channels, Subscribers and Subscriptions)

Quoted from: http://realscom.blogspot.de/2017/05/migrate-notifications-to-another.html

  • Export the Microsoft.SystemCenter.Notifications.Internal mp in the old and new management groups – do not modify these, they are our backup copies
  • Make a copy of both files and rename them to something meaningful like Microsoft.SystemCenter.Notifications.Internal.ManagementGroupName_backup.xml
  • Make a note of the MP version number of the Microsoft.SystemCenter.Notifications.Internal MP from the new management group. In my case it was 7.2.11719.0
  • Open up the Microsoft.SystemCenter.Notifications.Internal.xml file for the old management group and change the version number to that of the new management group MP version + 1. In my case I changed it from 7.0.9538.0 to 7.2.11719.1. This is so the MP imports properly in the new management group

Exporting Configuration with PowerShell

Open a PowerShell console on the 2012 R2 Management Server and use the following cmdlets to store the information to C:\Temp\ResolutionStates.json

Alert Resolution State

Only user defined resolution states are exported.

Get-SCOMAlertResolutionState | Where-Object {$_.IsSystem -eq $false} | Select-Object Name, ResolutionState | ConvertTo-Json | Out-File C:\Temp\ResolutionStates.json
Alert Auto – Resolution Settings

Limited to the properties AlertAutoResolveDays and HealthyAlertAutoResolveDays

Get-SCOMAlertResolutionSetting | Select-Object AlertAutoResolveDays, HealthyAlertAutoResolveDays | ConvertTo-Json | Out-File C:\Temp\SCOMExport\AlertResolutionSetting.json
Database Grooming Settings (hint to dwarp! Plus link)

Exports data retention settings for information in the OperationsManager database. The DataWareHouse database is covered later.

Get-SCOMDatabaseGroomingSetting | Select-Object AlertDaysToKeep, AvailabilityHistoryDaysToKeep,EventDaysToKeep,JobStatusDaysToKeep,MaintenanceModeHistoryDaysToKeep,MonitoringJobDaysToKeep,PerformanceDataDaysToKeep,PerformanceSignatureDaysToKeep,StateChangeEventDaysToKeep | ConvertTo-Json | Out-file C:\Temp\SCOMExport\DatabaseGrooming.json
User Roles

Exporting User roles and System roles into dedicated files.

Get-SCOMUserRole | Where-object {$_.IsSystem -eq $false } | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\UserRoles.json
Get-SCOMUserRole | Where-object {$_.IsSystem -eq $true }  | Select-Object -Property Name, ProfileDisplayName, Users | ConvertTo-Json | Out-File C:\Temp\SCOMExport\SystemUserRoles.json
Run As Accounts

Exporting user defined RunAsAccounts. System defined ones are created by Management Packs.

Get-SCOMRunAsAccount | Where-Object {$_.isSystem -eq $false} | Select-Object -Property AccountType, UserName, SecureStorageId, Name, Description, SecureDataType | ConvertTo-Json | Out-File C:\Temp\SCOMExport\RunAsAccounts.json
<# The cmdlet 'Get-SCOMRunAsAccount' can't extract credential information. A free cmdlet, written by Stefan Roth is required for that. – Import it to your 2012 R2 environment before proceeding with the following lines.

https://www.powershellgallery.com/packages/RunAsAccount/1.0 #>
Get-SCOMRunAsAccount | Select-Object -ExpandProperty Name | ForEach-Object {

    try {  

        $theName  = $_

        $theTmp  = Get-RunAsCredential -Name $theName

        $thePass = ($theTmp.Password).ToString()

        $thePass




        $secPass  = ConvertTo-SecureString $thePass -AsPlainText -Force

        $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)




        $theCreds | Export-Clixml -Path "C:\temp\SCOMExport\$theName.cred"

        $fileName =  "C:\temp\SCOMExport\" + $theName + ".txt"

        "$($thePass)" | Out-File -FilePath $fileName

    } catch {

        $info = 'Swallowing exception'

    }

}

Importing Configuration with PowerShell

Copy the JSON files to the 1801 Management Server (C:\Temp\SCOM-Scripts)  and import using the following code:

Alert Resolution State
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\ResolutionStates.json | ConvertFrom-Json



foreach ($aState in $jsonFileContent) { 

Add-SCOMAlertResolutionState -Name $aState.Name -ResolutionStateCode $aState.ResolutionState

}


Alert Resolution Setting
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\AlertResolutionSetting.json | ConvertFrom-Json

Set-SCOMAlertResolutionSetting -AlertAutoResolveDays $jsonFileContent.AlertAutoResolveDays -HealthyAlertAutoResolveDays $jsonFileContent.HealthyAlertAutoResolveDays

Database Grooming Settings
$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\DatabaseGrooming.json | ConvertFrom-Json

Set-SCOMDatabaseGroomingSetting -AlertDaysToKeep $jsonFileContent.AlertDaysToKeep -AvailabilityHistoryDaysToKeep $jsonFileContent.AvailabilityHistoryDaysToKeep -EventDaysToKeep $jsonFileContent.EventDaysToKeep

Set-SCOMDatabaseGroomingSetting -JobStatusDaysToKeep $jsonFileContent.JobStatusDaysToKeep -MaintenanceModeHistoryDaysToKeep $jsonFileContent.MaintenanceModeHistoryDaysToKeep -MonitoringJobDaysToKeep $jsonFileContent.MonitoringJobDaysToKeep

Set-SCOMDatabaseGroomingSetting -PerformanceDataDaysToKeep $jsonFileContent.PerformanceDataDaysToKeep -PerformanceSignatureDaysToKeep $jsonFileContent.PerformanceSignatureDaysToKeep -StateChangeEventDaysToKeep $jsonFileContent.StateChangeEventDaysToKeep        

User Roles
# Importing User roles and System roles from dedicated files. – Review them first.

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\UserRoles.json | ConvertFrom-Json

foreach ($uRole in $jsonFileContent) {

    switch ($uRole.ProfileDisplayName) {       

        'Advanced Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -AdvancedOperator           

        }       

        'Author' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Author

        }

        'Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -Operator

        }

        'Read-Only Operator' {

            Add-SCOMUserRole -Name $uRole.Name -DisplayName $uRole.Name -Users $uRole.Users -ReadOnlyOperator

        }       

    }       

}




$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\SystemUserRoles.json | ConvertFrom-Json

foreach ($sRole in $jsonFileContent) {

    if ($sRole.Users) {   

        Get-SCOMUserRole | Where-Object {$_.Name -eq $sRole.Name} | Set-SCOMUserRole -User $sRole.Users

    } else {

        continue

    }    

}
Run As Accounts
# Importing User roles and System roles from dedicated files. – Review them first.

$jsonFileContent = Get-Content -Path C:\Temp\SCOM-Scripts\RunAsAccounts.json | ConvertFrom-Json



foreach($rAccount in $jsonFileContent) {




    $theName  = $rAccount.Name   

    if ($theName -match 'Data Warehouse') {

        write-warning "Skipping default account $theName"

        continue

    }

    switch ($rAccount.AccountType) {

        'SCOMCommunityStringSecureData' {                       

            $credFile =  Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

            $secPass  = ConvertTo-SecureString $credFile -AsPlainText -Force

            Add-SCOMRunAsAccount -CommunityString -Name $theName -Description $rAccount.Description -String $secPass

        }

        'SCXMonitorRunAsAccount' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)               

                Add-SCOMRunAsAccount -SCXMonitoring -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo              

            } catch {           

                Write-Warning $_                            

            }           

        }

        'SCXMaintenanceRunAsAccount' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)               

                Add-SCOMRunAsAccount -SCXMaintenance -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds -Sudo               

            } catch {

                Write-Warning $_                            

            }           

        }

        'SCOMBasicCredentialSecureData' {

            try{              

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)                

                Add-SCOMRunAsAccount -Basic -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds                

            } catch {

                Write-Warning $_                            

            }           

        }

        'SCOMWindowsCredentialSecureData' {

            try{               

                $thePass = Get-Content -Path "C:\Temp\SCOM-Scripts\creds\$theName.txt"

                $secPass = ConvertTo-SecureString $thePass -AsPlainText -Force

                $theName  = $env:USERDOMAIN + '\' + $rAccount.Name

                $theCreds = New-Object -TypeName System.Management.Automation.PSCredential($theName,$secPass)                

                Add-SCOMRunAsAccount -Windows -Name $theName -Description $rAccount.Description -RunAsCredential $theCreds                

            } catch {

                Write-Warning $_                            

            }           

        }

        default {

            Write-Warning "Not coverd: $rAccount.AccountType please add manually."

        }

    }

}
Run As Profiles

Run As Profiles usually come with Management Packs and therefore not handled here.

Note:

Run As Profiles are the binding component between Run As Account and the Object (e.g. a Computer / Health State) they are used. Please check the configuration in the 2012 R2 environment to setup the mapping and distribution manually.

Datawarehouse Retention Settings

Datawarehouse retention settings are configured with a cmdline tool named dwdatarp.exe. It was released in 2008 and works since then.

Download link can be found here: https://blogs.technet.microsoft.com/momteam/2008/05/13/data-warehouse-data-retention-policy-dwdatarp-exe/

Checking the current settings

After downloading, copy the unpacked executable to your SCOM – database server (e.g. C:\Temp).

Open an elevated command prompt and use the following call to export the current configuration:

dwdatarp.exe -s OldSCOMDBSrvName -d OperationsManagerDW > c:\dwoutput.txt

Setting new values

The following values are just a suggestion to reduce the amount or required space.

dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Alert data set" -a "Raw data" -m 180
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Raw data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Client Monitoring data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Configuration dataset" -a "Raw data" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Event data set" -a "Raw Data" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "Performance data set" -a "Daily aggregations" -m 90
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Raw data" -m 15
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Hourly aggregations" -m 30
dwdatarp.exe -s newscomdbsrv -d OperationsManagerDW -ds "State data set" -a "Daily aggregations" -m 90

Agent Migration – Remove the older Management Group

After the computer appears as fully managed computer in the SCOM console, remove the older Management group.

Jimmy Harper has authored a Management Pack which helps to clean-up the agent migration after the new agent has been deployed. Please read through:

https://blogs.technet.microsoft.com/jimmyharper/2016/08/13/scom-task-to-add-or-remove-management-groups-on-agents/

This is great! Thanks a lot Ruben, for taking the time and efforts to put this all together!

Get to know more about Ruben here!

Cheers!

Linux/UNIX agent failover and Resource Pools – Debunking the Myth! – Part 2

We discussed about the common misconceptions regarding failover for Windows agents in SCOM in the previous post (part 1). In this part, as promised, Stoyan Chalakov is going to be discussing about the failover for Linux machines monitored by SCOM. Here goes:

Linux/UNIX agent failover and Resource Pools – debunking the myth – Part 2

While answering questions on the Microsoft Technet Forums, I noticed that are lots of questions on topics, which are repeating, or which often are not that well understood. So I had a discussion with Sam (Sameer Mhaisekar) and we decided that it would be very beneficial if we write short blog posts on those topics.

One such topic is agent failover in SCOM and the difference between a Windows and Linux/UNIX agent. In order to understand how the Linux/UNIX agent failover works, one need to first become acquainted with the concept of Resource Pools in Operations Manager.

Sam already wrote the first part on this subject, where he explains the basic of Resource Pools and gives also example of how the failover of the Windows agent work in SCOM. He also referenced 3 very important articles about Resource Pools in SCOM, which are you need to read before getting to Linux/UNIX agent failover.

Before jumping to failover, it is important to mention some important facts about the architecture of Linux/UNIX agent.

The Linux/UNIX agent is very different from the Windows one and hasn’t been changed since Operations Manager 2012. One of the most important functional difference, compared to a Windows agent is the absence of a Health Service implementation. So, all the monitored data is passed to the Health Service on a management server, where all the management pack workflows are being run. This makes it a passive agent, which is being queried (using the WSMan protocol, Port 1270) for availability and performance data by the management servers in the Resource Pool.

The first important thing that needs mentioning is that you cannot discover and monitor UNIX/Linux systems without configuring a Resource Pool first. When you start the SCOM discovery wizard you will notice that you cannot continue unless you have selected a Resource Pool from the drop-down menu.

Here a few important considerations regarding the Resource Pool that will manage the UNIX/Linux agents:

  • It is recommended and, in my opinion, very important to dedicate the management servers in the cross platform Resource Pool only to UNIX/Linux monitoring. The reason for this are the capacity limits, which Operations Manager has when it comes to monitoring UNIX\Linux and Windows and which needs to be calculated very accurately. I will try to explain this in detail. A dedicated management server can handle up to 3000 Windows agents, but only 1000 Linux or UNIX computers. We already revealed the reason for that – cross platform workflows are being run on the management server and this costs performance.

So, if you have also Windows agents, reporting to the dedicated management server, capacity and scalability calculations cannot be made precise and the performance of the management server can be jeopardized.

This fully applies and is a must for larger organizations where there are many Linux or UNIX computers (hundreds of systems) and their number grows. In smaller monitored environments, where you have a small (tens of systems), (almost) static number of cross platform agents, very often, dedicating management servers only for those systems can be an overkill. So, in such cases, I often use management servers, which are not dedicated and are members of the Default Resource Pools or are managing Windows agents. This of course, is only possible if the number of Windows agents is way below the capacity limits of the management group, which would leave enough system Resources on the management server for running the Linux/UNIX workflows.

  • It is very important to dedicate the management server in the cross platform Resource Pool only to UNIX/Linux monitoring. This means not only that you should not assign Windows agents to report to it, but also the server should be excluded also from the other Resource Pools in the management group (SCOM Default Resource Pools, network monitoring Resource Pools, etc.). The reason is the same – performance. If the management server participates in other Resource Pools, it will execute also other types of workflows.

To exclude the management server from the Default Resource Pools, you will need to modify their membership from automatic to manual. By default, each management server, added to the management group is automatically added to the Resource Pools that have an automatic membership type. For some of the Resource Pools this can be accomplished over the console, for others like the “All Management Servers Resource Pool” this can be done only with PowerShell:

Get-SCOMResourcePool -DisplayName “All Management Servers Resource Pool” | Set-SCOMResourcePool -EnableAutomaticMembership 0

  • When you do the capacity planning for your management group, make sure you don’t forget to calculate the number of UNIX or Linux computers a management server can handle in the case another member of the Resource Pool fails. Let me explain this with an example:

According to the official documentation (see the link above), a dedicated management server can handle up to 1000 Linux or UNIX computers. But, if you have two dedicated management servers in your cross platform Resource Pool and you aim for high availability, you cannot assign 2000 (2x 1000) agents to the Pool. Why? Just imagine what will happen with a management server if its “buddy” from the same Resource Pool fails and all its agents get reassigned to the one, which is still operational. You guessed right – it will be quickly overwhelmed by all the agents and become non-operational. So, the right thing to do if you would like your cross platform monitoring to be highly available, is to have 2 management for not more than 1000 agents, so that in case of failure the remaining server can still handle the performance load.

  • Last, but not least, make sure the management servers, which will be used for Linux/UNIX monitoring are sized (RAM, CPUs, Disk space) according to the Microsoft recommendations.

Now back to the agent failover topic…I think it got already pretty clear how the Linux/UNIX agent failover happens behind the scenes, but short summary won’t do any harm:

  • After the discovery wizard is started, a Resource Pool must be selected for managing the systems.
  • When the Resource Pool is selected, it assigns one of the participating management server to complete the actual discovery of the systems and take over the monitoring.
  • When the management server fails, the Resource Pool selects one of its other members to take over the monitoring.

Here a reference to what we said in the beginning that the UNIX/Linux agent is passive and is being queried by the management server. Because of this, it is not actually aware of what happens in the background and continues to communicate with the server, which has been now assigned to it.

Now is also the right time to make a couple of very import notes:

  • XPlat (cross platform) certificates

Part of the preparation of the environment for the monitoring of cross platform systems is the creation of self-signed certificate and its deployment to every management server, member of the Resource Pool. This will ensure that in case of failover each management server will be able to communicate with agent, using the same certificate.

  • High availability with Operations Manager Gateways as members of the Resource Pool (thanks to Graham Davies for the reminder)

What I forgot to mention in the first version of this post, but is of high importance for maintaining high availability of your Gateway Resource Pools (Resource Pool, consisting of Operations Manager Gateway servers) is the fact that two Gateways are not sufficient for achieving high availability. Why? You will find the answer in the article Kevin Holman wrote about Resource Pools in SCOM and how exactly they provide high availability This is also the same article Sam posted in the first part and it is must read if you have to plan for and manage Resource Pools and cross platform agent failover in Operations Manager:

Understanding SCOM Resource Pools

https://kevinholman.com/2016/11/21/understanding-scom-resource-pools/

Conclusion

Understanding UNIX/Linux agent high availability and failover is not a hard thing to do. Still, in order to properly plan for Operations Manager cross platform monitoring, there are some additional things like sizing and scalability that need to be considered.

Awesome, thanks a lot Stoyan! It was very informative and interesting read, as usual! 🙂

You can get in touch with Stoyan on LinkedIn or Twitter, or visit his Technet profile.

Cheers!

Autogrowth on SCOM Operational DB?

This is another of the hot topics I find with differences in opinion among the experts.

The other one we discussed was Windows Agents and Failover – Debunking the Myth!

Should you enable autogrowth on SCOM Operational Database?

I did some some research online and consulted some of the best SCOM experts I know and put together an article that explains why you would NOT want to autogrow your SCOM DB.

The short version is:

DO NOT autogrow your SCOM Operational DB, unless you absolutely need to. Autogrowing DB comes with its own set of disadvantages and might affect the performance of the DB.

So, choose the size of your DB very carefully while you are designing your Management Group!

The longer and more detailed version is here:

Should You Enable Autogrowth on SCOM Operations Database?

Cheers!

PS. Special thanks to Stoyan Chalakov and “SCOM Bob” Cornelissen for reviewing the article and suggesting edits! 🙂

Cyril Azoulay – An expert behind the mask! – [Interview]

I am a regular visitor/contributor to the SCOM Technet Forums and regularly meet many experts from around the globe there. Fortunately for me I eventually became friends/acquaintances with many of them. One of them is Cyril! If you have posted a question in the forums, you may very well have received the solution from a “CyrAz”, that’s him! 😉

I was able to convince Cyril to feature in one of our community experts interview series, and from his answers you can easily guess the knowledge and experience Cyril possesses! Enough chatter from me, I will let Cyril do the talking now:

Q: Hi Cyril! Thanks for agreeing to this. I see you very often on the SCOM Technet forums, and from what I can tell, you really know what you’re talking about! Can you please tell us a little about yourself and your professional journey so far?
A: Thanks for having me 🙂
I’m a system consultant and I almost exclusively work on Microsoft technologies. I work on quite a lot of them (Active Directory, PKI, Exchange, Direct Access, HyperV/Storage Space direct and even Azure Stack these days) but my current field of expertise is System Center, and more specifically SCOM and Orchestrator.
I started working on these two technologies around 6 years ago when my boss wanted me to specialize on something and asked me if I was interested in System Center. I said yes, because I was interested in the whole “everything is integrated together” aspect of the suite, being able to manage every aspect of the datacenter etc. I quickly realized this ideal world only existed in Microsoft’s presentations and that most of these products were actually not designed to run together and were even pretty clunky by themselves, but I still really liked what they allowed to do!
So I went from customer to customer, doing things that were more and more complicated, from daily operator tasks to architecture design and complex MP development; until now where I believe I can call myself an “expert”, even though I’m still learning things every day. I’ve only started using Visual Studio for MP developments a little bit over a year ago!
And I can say that blogs and Technet forums helped me a lot during this journey, it’s really nice to have such a knowledgeable and sharing  community!
Q: What is your opinion about Microsoft’s strategy of 2 different models for System Center products? What do you prefer, the LTSC or SAC model?

A: I’m glad Microsoft is still developing SCOM, and I can’t wait to get the most recent improvements so I definitely prefer SAC. But to be honest, I’m a bit disappointed by what was added in 1801 and 1807… I’m convinced SCOM has great foundations, the Management Pack system is incredibly powerful and flexible, but there is so much that could be done to improve the final product!

Now from my customers point of view, it really depends on how much they can afford in maintaining their environments… if they have a dedicated team that has enough knowledge, they can go for SAC. If they rely exclusively on me coming from time to time to check that everything is running properly, that may not be the best idea.

Q: How do you feel about the whole SCOM vs. OMS argument? Will OMS be a viable “replacement” for SCOM?
A: I must say I really like Log Analytics and what it is becoming, especially since Kusto query language was released. I was already a huge fan of competing products such as Splunk, and that pretty much sums up what I think about the SCOM vs Log Analytics argument : it has no reason to exist since they are not competitors!
Log Analytics is becoming a great tool for… well, ingesting, archiving and analyzing logs of all kind.
But for now it is a terrible replacement for SCOM, regardless of what Microsoft is trying to make us believe! It doesn’t provide any solution to replace the huge list of existing management packs and everything they include, it doesn’t provide an easy solution to develop your own monitoring, it has terrible SLA regarding data ingestion latency…
However, I can very well see how they can work together to provide more information on your environment, and how Log Analytics can become a good monitoring tool someday. But definitely not today, and probably not tomorrow neither.
Q: Given a chance, what would you like to ask or advise to Microsoft?
A: Don’t abandon great tools that can still provide tremendous service to many people just because you believe they are not the future! And maybe try to make them a little bit more usable by non experts…
Q: Lastly, the traditional question! Star Wars or Star Trek?
A: A 4-cheese pizza and a cold beer, thanks 🙂
Awesome! Thanks again for sharing your expertise with us Cyril, and I hope you will continue to help the folks on the forums! 😀
You can also get in touch with him on LinkedIn:
You can also visit his blog (in French) here:
Cheers!
Sam

Authoring PowerShell SCOM Console Tasks in XML – Ruben Zimmermann

Summary:

This post describes how to create PowerShell SCOM Console Tasks in XML along three examples.

Console-ListTopNProcesses

Introduction:

Console Tasks are executed on the SCOM Management Server. Three examples show how to create them using Visual Studio.

  • Task 1: Displaying a Management Server’s Last-Boot-Time [DisplayLastBootTime]
    • Executes a PowerShell script which displays the result in a MessageBox
  • Task 2: Show all new alerts related to selected Computer [ShowNewAlerts]
    • Passes the Computer principal name property (aka FQDN) to the script which then uses a GridView to display the alerts
  • Task 3: Listing the top N processes running on a Management Server [ListTopNProcesses]
    • An InputBox let the user specify the number (N) of top process on the Management server which is retrieved by script and shown via GridView

Requirements:

This blog assumes that you have created already a management pack before. If not, or if you feel difficulties please visit the section ‘Reading’ and go through the links.

The used software is Visual Studio 2017 (2013 and 2015 should work as well) plus the Visual Studio Authoring Extension. Additionally, the ‘PowerShell Tools for Visual Studio 2017’ are installed.

The Community Edition of Visual Studio technically works. – Please check the license terms in advance. – If you are working for a ‘normal company’ you will most likely require Visual Studio Professional.

Realization:

Initial steps in Visual Studio

à Create a new project based on Management Pack / Operations Manager 2012 R2 Management Pack template.

à Name it SCOM.Custom.ConsoleTasks

à Create a folder named Health Model and a sub folder of it named Tasks

à Add an Empty Management Pack Fragment to the root and name it Project.mpx.

Project File Project.mpx Content:

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <LanguagePacks>

    <LanguagePack ID="ENU" IsDefault="true">

      <DisplayStrings>

        <DisplayString ElementID="SCOM.Custom.ConsoleTasks">

          <Name>SCOM Custom ConsoleTasks</Name>

        </DisplayString>

      </DisplayStrings>

    </LanguagePack>

  </LanguagePacks>

</ManagementPackFragment>

Your screen should look like this now:

VSAE-ProjectFile

VSAE-ProjectFile.gif

Creating DisplayLastBootTime task

Within the Tasks folder create a class file named ConsoleTasks.mpx and remove all XML insight the file.

ConsoleTasks.mpx firstly only contains the code for the task DisplayLastBootTime and will be successively extended with the other two tasks.

 

Content of ConsoleTasks.mpx :

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Categories>

    <Category ID ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/>

  </Categories>
  <Presentation>

    <ConsoleTasks>      

      <ConsoleTask ID="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask" Accessibility="Public" Enabled="true" Target="SC!Microsoft.SystemCenter.RootManagementServer" RequireOutput="false">        <Assembly>SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -Command "& { $IncludeFileContent/Health Model/Tasks/DisplayManagementServerLastBootTime.ps1$ }"]]></Argument>          

        </Parameters>

      </ConsoleTask>    

    </ConsoleTasks>

  </Presentation>
  <LanguagePacks>

    <LanguagePack ID="ENU" IsDefault="true">      

      <DisplayStrings>        

        <DisplayString ElementID="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.ConsoleTask">

          <Name>Custom Console-Tasks: Display LastBootTime</Name>

          <Description></Description>

        </DisplayString>        

      </DisplayStrings>

    </LanguagePack>

  </LanguagePacks>
  <Resources>

    <Assembly ID ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.DisplayLastBootTime.Assembly" />

  </Resources>

</ManagementPackFragment>

 

Key points explanation for ConsoleTasks.mpx

Categories

  • Category Value specifies that this element is a console task

Presentation

ConsoleTask Target defines against what this task is launched. In this case it’s the Management Server. You can only see the task if you click on a View that is targeting the corresponding class. E.g.:

Console-ManagementServer-Tasks

Console-ManagementServer-Tasks.gif
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).

LanguagePack

  • DisplayString maps the Console Task ID to a text that is more user friendly. This will be shown in the SCOM console

Within the Tasks folder create a PowerShell file named DisplayManagementServerLastBootTime.ps1

Content of  DisplayManagementServerLastBootTime.ps1 :
$regPat         = '[0-9]{8}'

$bootInfo       = wmic os get lastbootuptime

$bootDateNumber = Select-String -InputObject $bootInfo -Pattern $regPat | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value

$bootDate       = ([DateTime]::ParseExact($bootDateNumber,'yyyyMMdd',[Globalization.CultureInfo]::InvariantCulture))

$lastBootTime   = $bootDate | Get-Date -Format 'yyyy-MM-dd'




$null = [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.VisualBasic')

$null = [Microsoft.VisualBasic.Interaction]::MsgBox($lastBootTime,0,"""Last Boot time of $($env:COMPUTERNAME)""")

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:
Console-DisplayLastBootTime

Console-DisplayLastBootTime.gif

Creating ShowNewAlerts task

Keep the content of ConsoleTasks.mpx unchanged. Add the new code into the proper places.

Additional Content for ConsoleTasks.mpx for ShowNewAlerts:

<Category ID ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/><ConsoleTask ID="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Computer" RequireOutput="false">

        <Assembly>SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -noexit -Command "& { $IncludeFileContent/Health Model/Tasks/ShowNewAlertsForThisComputer.ps1$ }"]]></Argument>          <Argument>"$Target/Property[Type='Windows!Microsoft.Windows.Computer']/PrincipalName$"</Argument>

        </Parameters>

      </ConsoleTask>
<DisplayString ElementID="SCOM.Custom.ConsoleTasks.ShowNewAlerts.ConsoleTask">

          <Name>Custom Console-Tasks: Display ShowNewAlertsForThisComputer</Name>

          <Description></Description>

        </DisplayString>

<Assembly ID ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.ShowNewAlerts.Assembly" />

Key points explanation for ConsoleTasks.mpx for ShowNewAlerts

Categories

  • Category Value specifies that this element is a console task

Presentation

  • ConsoleTask Target defines against what this task is launched. In this case it’s the Windows Computer. – This task is visible when in a View that shows Computer objects.
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).
  • Parameters, Argument “$Target/Property […]/PrincipalName$” gets the FQDN (principal name attribute) of the selected computer and makes it available for retrieving it in the script.

 

Content of ShowNewAlertsForThisComputer.ps1 :
param($ComputerName)

$allNewAlerts = Get-SCOMAlert | Select-Object -Property Name, Description, MonitoringObjectDisplayName, IsMonitorAlert, ResolutionState, Severity, PrincipalName, TimeRaised  | Where-Object {$_.PrincipalName -eq $ComputerName -and $_.ResolutionState -eq '0'}




if ($allNewAlerts) {

       $allNewAlerts | Out-GridView

} else {

       Write-Host 'No new alerts available for the computer: ' + $ComputerName

}

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:

Console-ShowNewAlerts

Console-ShowNewAlerts

Creating ListTopNProcesses task

Keep the content of ConsoleTasks.mpx unchanged. Add the new code into the proper places.

Additional Content for ConsoleTasks.mpx for ListTopNProcesses:

<Category ID ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTaskCategory"  Target="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask" Value ="System!System.Internal.ManagementPack.ConsoleTasks.MonitoringObject"/>  <ConsoleTask ID="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask" Accessibility="Public" Enabled="true" Target="SC!Microsoft.SystemCenter.RootManagementServer" RequireOutput="false">

        <Assembly>SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly</Assembly>

        <Handler>ShellHandler</Handler>

        <Parameters>

          <Argument Name="WorkingDirectory" />

          <Argument Name="Application">powershell.exe</Argument>

          <Argument><![CDATA[-noprofile -noexit -Command "& { $IncludeFileContent/Health Model/Tasks/ListTopNProcesses.ps1$ }"]]></Argument>

        </Parameters>

      </ConsoleTask>

<DisplayString ElementID="SCOM.Custom.ConsoleTasks.ListTopNProcesses.ConsoleTask">

          <Name>Custom Console-Tasks: Display ListTopNProcesses</Name>

          <Description></Description>

        </DisplayString>     

<Assembly ID ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly" Accessibility="Public" FileName ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly.File" HasNullStream ="true" QualifiedName ="SCOM.Custom.ConsoleTasks.ListTopNProcesses.Assembly" />

Key points explanation for ConsoleTasks.mpx for ListTopNProcesses

Categories

  • Category Value specifies that this element is a console task

Presentation

  • ConsoleTask Target defines against what this task is launched. In this case it’s the Windows Computer. – This task is visible when in a View that shows Computer objects.
  • Parameters, Argument Name “Application” sets PowerShell.exe to be called for execution
  • Parameters, Argument <![CDATA … defines the file within this Visual Studio project that contains the PowerShell code to be processed when the task is launched ( not handled yet ).

 

Content of ListTopNProcesses.ps1 :
$null          = [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.VisualBasic')

$receivedValue = [Microsoft.VisualBasic.Interaction]::InputBox("""Enter the number of top processes to be shown""", """List top processes:""", "10")

Get-Process | Sort-Object -Property CPU -Descending | Select-Object -First $receivedValue  | Out-GridView

 

Deploy this Management Pack to the SCOM Server and test it. The following screenshot shows the expected result:

Console-ListTopNProcesses

Console-ListTopNProcesses.gif

Reading:

If you are new to management pack authoring I suggest the free training material from Brian Wren. – It was made for SCOM 2012 R2 but is still valid until today (current SCOM 1807 in year 2018).

On Microsoft’s Virtual Academy: System Center 2012 R2 Operations Manager Management Pack

On Microsoft’s Wiki: System Center Management Pack Authoring Guide

Closing:

If you have questions, comments or feedback feel free to feedback in our SCOM – Gitter – Community on: https://gitter.im/SCOM-Community/Lobby

The Visual Studio solution file is downloadable here:

SCOM Console Powershell Task

Thanks!

Ruben Zimmermann

 

Know more about Ruben and get in touch with him here:

Ruben Zimmermann (A fantastic person who has a lot of great ideas) [Interview]

SCOM User Session Duration – Powershell

A few days ago, I needed to find out how many users are connecting to SCOM daily/weekly and how long was each user connected. Out-of-the-box SCOM does not provide you a way of doing this. So I started looking around for some hints. I came across this article, which looked pretty convincing.

https://blogs.technet.microsoft.com/dirkbri/2014/10/15/an-approach-of-collecting-and-analyzing-scom-2012-sdk-connections/

This looked all good, except I wanted to do it with Powershell.

Whenever any “client” connects/disconnects to/from the console an event is written in the Operations Manager event log. In each event there is event data which gives you information on the following points:

  • EventID
  • ManagementServer Name
  • Username
  • SessionID
  • UUID – extracted from SessionID
  • ID – extracted from SessionID
  • TimeCreated
  • SessionDuration -calculated Time between log on and log off Event
  • SessionCount – calculated cumulated counter of all current sessions

So, I started working on a script that’d meet my requirements. To be honest, I am still a Powershell student, so I got stuck halfway. So I decided to “get-help” (Powershell pun, get it? ;)) from my friend and guide Stoyan Chalakov. I explained to him my idea and asked whether he would help me script it. Sure enough, I wasn’t disappointed. He liked the idea as well, and came up with a nice script that outputs the data the way we imagined. I’ve linked the link to his script on Technet at the end.

Some important remarks which apply to both methods (the one described in the blog and the script):

“Yes, there are design flaws in this approach:

  1. If you create daily files, you will have an unknown amount of already open sessions at the beginning of the day and a certain amount of not yet closed sessions at the end. So your SessionCount will be lower than the cumulated values from the “Client Connections” performance counters of all Management Servers.
    But when you analyse e.g. weekly data instead of daily, you should get very good results.
  2. The tricky part is , the exact same event is written when a Powershell SDK connection is made. You cannot distinguish between PowerShell SDK connections and Operations console connections. Unfortunately this is by design and I am not aware of a way to mitigate this problem.”

The script parses all the events, which are being logged in regards to the particular user and gathers information in regards to:

  • Open session, where the session has not been closed (presence of the Login event, absence of Logout event)
  • Closed session and their duration (presence of the Login event and Logout event)
  • Closed session without and Login session event, indicating that the Login session event has been overridden and cannot be found.
  • The script should be run directly on a management server. If you need you can enable PSremoting and run this on a multiple management server.
  • The script is in its first version, where you need to enter a domain and username. The second version (currently being worked on) will include a full report on all user session and their duration, without the need of specifying a particular user name.

Sample output of the script:

user session

You can download the script here:

Stoyan has commented thoroughly throughout the script, so there would no issues to understand the way it works.

SCOM User Session Script

Thanks a ton Stoyan!

Cheers!

 

 

 

 

 

 

 

 

 

Management Server Frequently Greying Out?

I have seen this issue happening a number of times now. The cause of this can be a few things going wrong, but as part of the troubleshooting I’ve noticed a way that works almost every time, if it applies.

Problem :

All of the sudden, the management server(s) greys out. You check the services, all the services are running. Still for good measure, you restart the services – but no use. You then also try flushing the health state folder cache on the affected MS. And sure, the MS becomes healthy again.

But again after some time you notice that the MS has greyed out. You repeat the process of flushing the cache, it becomes green, and after some time becomes grey again. This cycle continues.

In the event log you may see several events, but not sure where to start. Now these can be any events that may actually be the cause of the problem, or maybe the consequence of it. That’s why you need to read carefully through each of them and find out what event is exactly the problem and which ones are the consequences.

The event we’re discussing here is one particular event 4502. Now this event ID is logged for a number of different reasons and with different descriptions. The one we’re looking for goes something like this (sample only, your descriptions would change acoordingly):

A module of type "Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.InstanceSpaceSubscriptionDataSource" reported an exception System.ArgumentNullException: Value cannot be null.

Parameter name: value

   at System.Collections.CollectionBase.OnValidate(Object value)

   at System.Collections.CollectionBase.System.Collections.IList.Add(Object value)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.HttpRESTClient.PostDataAsync(Byte[] data, Object context)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.SubscriptionDataSource`2.WriteToCloud(List`1 items, DateTime firstTryDateTime)

   at Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.SubscriptionDataSource`2.PostAsync(List`1 items, DateTime firstTryDateTime) which was running as part of rule "Microsoft.SystemCenter.CollectInstanceSpace" running for instance "All Management Servers Resource Pool" with id:"{4932D8F0-C8E2-2F4B-288E-3ED98A340B9F}" in management group "MG".

These events may come in conjuncture with several others, but I like to fix this one first, as it solves the problem most of the times.

Analysis : 

The event might seem cryptic at first, especially if you aren’t used to troubleshooting, but it provides a valuable piece of information. Note the last line of the description. It says,

which was running as part of rule "Microsoft.SystemCenter.CollectInstanceSpace" running for instance "All Management Servers Resource Pool" with id:"{4932D8F0-C8E2-2F4B-288E-3ED98A340B9F}" in management group "MG".

Here, you get some interesting information, as to which exact rule/monitor is failing, and running for what instance.

Ok, so we have the rule ID and the target. The rule is “Microsoft.SystemCenter.CollectInstanceSpace”. With a quick glance at the System Center Wiki tells me that the display name of this rule is “Send Instancespace to the Cloud” and it is a “System rule that sends instancespace up to the cloud.”

So what happens here is, the rule runs at it’s scheduled interval, and fails. This causes the MS where it’s running on to go grey. When you re-initialize the cache on the MS, everything is reset, and the MS becomes green. Then again, the rule runs at its interval and fails again, the MS goes grey again, and the cycle goes on.

Resolution :

Ok, so now we have some solid information to work on. Grab the rule name, find it in the Rules in your console. Once you do, take a look at the properties. You’d know what is it exactly doing, any overrides, what MP is it coming from, etc.

Now that you’ve found the rule that is the root of the problem, disable it. Now, go back and flush the cache on the MS again. As it is downloading the configuration again, keep an eye on the event log for any errors.

If the MS becomes and remains green, we’re done! If if goes back to grey, follow the process all over again, until you notice there are no more failing workflows from rules/monitors that are causing the MS to go grey.

One step further, if you notice that all these rules/monitors are from the same MP, chances are that MP has been corrupted and you may want to remove or update the MP.

Note that although this might solve your problem, it may not be the only one causing the issue. E.g., bad performance of your databases can also result in this problem. So if you find the problem is still persisting, look for other relevant events that might give you a hint. 🙂

You can refer to these threads from the Technet forums for further reading:

SCOM Health Service greyed out on Management Server

Management server getting greyed out again and again

Hope this helps someone out there with similar issues.

Cheers!

 

Guest Blog – Authoring a PowerShell Agent Task in XML – By Ruben Zimmermann

It is my absolute pleasure to announce this guest blog today. My good friend Ruben has come up with a fantastic community MP that is going to make all of our lives a lot easier 🙂

I will let him talk about it himself. All yours, Ruben! –


Authoring a PowerShell Agent Task in XML

Summary:

This post describes the code behind a PowerShell Agent Task. With the help of a real-life case ‘Create Log Deletion Job’ the lines in XML and PowerShell are commented.

Introduction:

Create Log Deletion Job is a SCOM – Agent Task which offers the creation of a scheduled task that deletes log files older than N days on the monitored computer. It works on SCOM 2012 R2 and later.

Requirements:

This blog assumes that you have created already management packs before. If not, or if you feel difficulties please visit the section ‘Reading’ and go through the links.

The used software is Visual Studio 2017 (2013 and 2015 work as well) plus the Visual Studio Authoring Extension. Additionally the ‘PowerShell Tools for Visual Studio 2017’ are installed.

The Community Edition of Visual Studio technically works. – Please check the license terms in advance. – If you are working for a ‘normal company’ you will most likely require Visual Studio Professional.

Realization:

The agent task mainly consists of two components.

  1. A PowerShell Script which creates a Scheduled Task and writes another script on the target machine with the parameters specified in the SCOM console.
  2. A custom module that contains a custom write action based on PowerShell Write Action and a task targeting Windows Computer leveraging the write action.
  3. Visual Studio Authoring Extension (VSAE), a free Plugin for Visual Studio is used to bind the XML and PowerShell together and produce a Management Pack.
    The Management Pack itself can be downloaded as Visual Studio solution or as compiled version directly from GitHub.

https://github.com/Juanito99/Windows.Computer.AgentTasks.CreateLogDeletionJob

Steps to do in Visual Studio:

  • Create a new project based on Management Pack / Operations Manager R2 Management Pack.- Name it ‘Windows.Computer.AgentTasks.CreateLogDeletionjob’
  • Create a folder named ‘Health Model’ and a sub folder of it named ‘Tasks’.
  • Add an Empty Management Pack Fragment to the root and name it Project.mpx.

Project File ‘project.mpx’ Content:

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <LanguagePacks>




    <LanguagePack ID="ENU" IsDefault="true">

      <DisplayStrings>

        <DisplayString ElementID="Windows.Computer.AgentTasks.CreateLogDeletionJob">

          <Name>Windows Computer AgentTasks CreateLogDeletionJob</Name>

          <Description>Creates a scheduled task on the managed computer which automatically deletes old logs.</Description>

        </DisplayString>

      </DisplayStrings>

    </LanguagePack>

  </LanguagePacks>




</ManagementPackFragment>

The Powershell Script:

Create a file named CreateLogDeletionJob.ps1 in the ‘Tasks'<h3> sub folder and copy the following content inside it.

param($LogFileDirectory,$LogFileType,$DaysToKeepLogs,$ScheduledTasksFolder)

$api = New-Object -ComObject 'MOM.ScriptAPI'

$api.LogScriptEvent('CreateLogDeletionJob.ps1',4000,4,"Script runs. Parameters: LogFileDirectory $($LogFileDirectory), LogFileType: $($LogFileType) DaysToKeepLogs $($DaysToKeepLogs) and scheduled task folder $($scheduledTasksFolder)")           

Write-Verbose -Message "CreateLogDeletionJob.ps1 with these parameters: LogFileDirectory $($LogFileDirectory), LogFileType: $($LogFileType) DaysToKeepLogs $($DaysToKeepLogs) and scheduled task folder $($scheduledTasksFolder)"

$ComputerName          = $env:COMPUTERNAME

$LogFileDirectoryClean = $LogFileDirectory      -Replace('\\','-')

$LogFileDirectoryClean = $LogFileDirectoryClean -Replace(':','')

$scheduledTasksFolder  = $scheduledTasksFolder -replace([char]34,'')

$scheduledTasksFolder  = $scheduledTasksFolder -replace("`"",'')

$taskName              = "Auto-Log-Dir-Cleaner_for_$($LogFileDirectoryClean)_on_$($ComputerName)"

$taskName              = $taskName -replace '\s',''

$scriptFileName        = $taskName + '.ps1'

$scriptPath            = Join-Path -Path $scheduledTasksFolder -ChildPath $scriptFileName

if ($DaysToKeepLogs -notMatch '\d' -or $DaysToKeepLogs -gt 0) {

                $daysToKeepLogs = 7

                $msg = 'Script warning. DayToKeepLogs not defined or not matching a number. Defaulting to 7 Days.'

                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4000,2,$msg)

}

if ($scheduledTasksFolder -eq $null) {

                $scheduledTasksFolder = 'C:\ScheduledTasks'

} else {

                $msg = 'Script warning. ScheduledTasksFolder not defined. Defaulting to C:\ScheduledTasks'

                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4000,2,$msg)

                Write-Warning -Message $msg

}

if ($LogFileDirectory -match 'TheLogFileDirectory') {

                $msg =  'CreateLogDeletionJobs.ps1 - Script Error. LogFileDirectory not defined. Script ends.'

                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4000,1,$msg)

                Write-Warning -Message $msg

                Exit

}

if ($LogFileType -match '\?\?\?') {    

                $msg = 'Script Error. LogFileType not defined. Script ends.'

                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4000,1,$msg)

                Write-Warning -Message $msg

                Exit

}

Function Write-LogDirCleanScript {




                param(

                                [string]$scheduledTasksFolder,

                                [string]$LogFileDirectory,                 

                                [int]$DaysToKeepLogs,                      

                                [string]$LogFileType,

                                [string]$scriptPath

                )

               

                if (Test-Path -Path $scheduledTasksFolder) {

                                $foo = 'folder exists, no action requried'

                } else {

                                & mkdir $scheduledTasksFolder

                }

               

                if (Test-Path -Path $LogFileDirectory) {

                                $foo = 'folder exists, no action requried'

                } else {

                                $msg = "Script function (Write-LogDirCleanScript, scriptPath: $($scriptPath)) failed. LogFileDirectory not found $($LogFileDirectory)"

                                Write-Warning -Message $msg

                                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4001,1,$msg)                

                                Exit

                }




                if ($LogFileType -notMatch '\*\.[a-zA-Z0-9]{3,}') {

                                $LogFileType = '*.' + $LogFileType

                                if ($LogFileType -notMatch '\*\.[a-zA-Z0-9]{3,}') {

                                                $msg = "Script function (Write-LogDirCleanScript, scriptPath: $($scriptPath)) failed. LogFileType: $($LogFileType) seems to be not correct."

                                                Write-Warning -Message $msg

                                                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4001,1,$msg)                

                                                Exit

                                }

                }




$fileContent = @"

Get-ChildItem -Path `"${LogFileDirectory}`" -Include ${LogFileType} -ErrorAction SilentlyContinue | Where-Object { ((Get-Date) - `$_.LastWriteTime).days -gt ${DaysToKeepLogs} } | Remove-Item -Force

"@         

               

                $fileContent | Set-Content -Path $scriptPath -Force

               

                if ($error) {

                                $msg = "Script function (Write-LogDirCleanScript, scriptPath: $($scriptPath)) failed. $($error)"                       

                                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4001,1,$msg)

                                Write-Warning -Message $msg

                } else {

                                $msg = "Script: $($scriptPath) successfully created"   

                                Write-Verbose -Message $msg

                }




} #End Function Write-LogDirCleanScript







Function Invoke-ScheduledTaskCreation {




                param(

                                [string]$ComputerName,                   

                                [string]$taskName

                )                

                               

                $taskRunFile         = "C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -NonInteractive -File $($scriptPath)"    

                $taskStartTimeOffset = Get-Random -Minimum 1 -Maximum 10

                $taskStartTime       = (Get-Date).AddMinutes($taskStartTimeOffset) | Get-date -Format 'HH:mm'                                                                                                                                                                                     

                $taskSchedule        = 'DAILY'             

                & SCHTASKS /Create /SC $($taskSchedule) /RU `"NT AUTHORITY\SYSTEM`" /TN $($taskName) /TR $($taskRunFile) /ST $($taskStartTime)                

                               

                if ($error) {

                                $msg = "Sript function (Invoke-ScheduledTaskCreation) Failure during task creation! $($error)"

                                $api.LogScriptEvent('CreateLogDeletionJob.ps1',4002,1,$msg)                

                                Write-Warning -Message $msg

                } else {

                                $msg = "Scheduled Tasks: $($taskName) successfully created"

                                Write-Verbose -Message $msg

                }              




} #End Function Invoke-ScheduledTaskCreation







$logDirCleanScriptParams   = @{

                'scheduledTasksFolder' = $ScheduledTasksFolder

                'LogFileDirectory'     = $LogFileDirectory       

                'daysToKeepLogs'       = $DaysToKeepLogs     

                'LogFileType'          = $LogFileType

                'scriptPath'           = $scriptPath

}




Write-LogDirCleanScript @logDirCleanScriptParams







$taskCreationParams = @{

                'ComputerName'  = $ComputerName             

                'taskName'      = $taskName

                'scriptPath'    = $scriptPath

}




Invoke-ScheduledTaskCreation @taskCreationParams

The Custom Module:

Add an Empty Management Pack Fragment in the ‘Tasks’ sub folder name it ‘AgentTasks.mpx’. Copy the content below into it.

<ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">




  <TypeDefinitions>   

    <ModuleTypes>

      <!-- Defining the Write Action Module Type. The name is user-defined and will be used in the task definition below. -->

      <WriteActionModuleType ID="Windows.Computer.AgentTasks.CreateLogDeletionJob.WriteAction" Accessibility="Internal" Batching="false">

        <!-- The items in the Configuration sections are exposed through the SCOM console -->

        <Configuration>

          <xsd:element minOccurs="1" name="LogFileDirectory" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

          <xsd:element minOccurs="1" name="LogFileType" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

          <xsd:element minOccurs="1" name="ScheduledTasksFolder" type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

          <xsd:element minOccurs="1" name="DaysToKeepLogs" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

          <xsd:element minOccurs="1" name="TimeoutSeconds" type="xsd:integer" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />

        </Configuration>

        <!-- To make exposed items editable it is required to specify them in the OverrideableParameters section -->

        <OverrideableParameters>

          <OverrideableParameter ID="LogFileDirectory" Selector="$Config/LogFileDirectory$" ParameterType="string" />

          <OverrideableParameter ID="LogFileType" Selector="$Config/LogFileType$" ParameterType="string" />

          <OverrideableParameter ID="ScheduledTasksFolder" Selector="$Config/ScheduledTasksFolder$" ParameterType="string" />

          <OverrideableParameter ID="DaysToKeepLogs" Selector="$Config/DaysToKeepLogs$" ParameterType="int" />         

        </OverrideableParameters>

        <ModuleImplementation Isolation="Any">

          <Composite>

            <MemberModules>

              <!-- The ID name is user-defined, suggested to keep it similar as the type. The type signals SCOM to initiate the PowerShell engine  -->

              <WriteAction ID="PowerShellWriteAction" TypeID="Windows!Microsoft.Windows.PowerShellWriteAction">

                <!-- To keep the module readable and the script better editable it is referenced below -->

                <ScriptName>CreateLogDeletionJob.ps1</ScriptName>

                <ScriptBody>$IncludeFileContent/Health Model/Tasks/CreateLogDeletionJob.ps1$</ScriptBody>

                <!-- The parameters are the same than in the Configuration section as we want to pass them into the script from the SCOM console -->

                <Parameters>

                  <Parameter>

                    <Name>LogFileDirectory</Name>

                    <Value>$Config/LogFileDirectory$</Value>

                  </Parameter>

                  <Parameter>

                    <Name>LogFileType</Name>

                    <Value>$Config/LogFileType$</Value>

                  </Parameter>

                  <Parameter>

                    <Name>ScheduledTasksFolder</Name>

                    <Value>$Config/ScheduledTasksFolder$</Value>

                  </Parameter>

                  <Parameter>

                    <Name>DaysToKeepLogs</Name>

                    <Value>$Config/DaysToKeepLogs$</Value>

                  </Parameter>

                </Parameters>

                <TimeoutSeconds>$Config/TimeoutSeconds$</TimeoutSeconds>

              </WriteAction>

            </MemberModules>

            <Composition>

              <Node ID="PowerShellWriteAction" />

            </Composition>

          </Composite>

        </ModuleImplementation>

        <OutputType>System!System.BaseData</OutputType>

        <InputType>System!System.BaseData</InputType>

      </WriteActionModuleType>     

    </ModuleTypes>

  </TypeDefinitions>




  <Monitoring>   

    <Tasks>     

      <!-- The name for the task ID is user-defined. Target is set to 'Windows.Computer' that make this task visible when objects of type Windows.Computer are shown in the console -->

      <Task ID="Windows.Computer.AgentTasks.CreateLogDeletionJob.Task" Accessibility="Public" Enabled="true" Target="Windows!Microsoft.Windows.Computer" Timeout="120" Remotable="true">

        <Category>Custom</Category>

        <!-- The name for the write action is user-defined. The TypeID though must match to the above created one. -->

        <WriteAction ID="PowerShellWriteAction" TypeID="Windows.Computer.AgentTasks.CreateLogDeletionJob.WriteAction">

          <!-- Below the parameters are pre-filled to instruct the users how the values in the overrides are exptected-->

          <LogFileDirectory>C:\TheLogFileDirectory</LogFileDirectory>

          <LogFileType>*.???</LogFileType>

          <ScheduledTasksFolder>C:\ScheduledTasks</ScheduledTasksFolder>

          <DaysToKeepLogs>7</DaysToKeepLogs>         

          <TimeoutSeconds>60</TimeoutSeconds>

        </WriteAction>

      </Task>     

    </Tasks>

  </Monitoring>




  <LanguagePacks>

    <LanguagePack ID="ENU" IsDefault="true">

      <DisplayStrings>       

        <DisplayString ElementID="Windows.Computer.AgentTasks.CreateLogDeletionJob.Task">

          <Name>Create Log Deletion Job</Name>

        </DisplayString>     

      </DisplayStrings>     

      <KnowledgeArticles></KnowledgeArticles>

    </LanguagePack>

  </LanguagePacks>




</ManagementPackFragment>

Reading:

If you are new to management pack authoring I suggest the free training material from Brian Wren.

On Microsoft’s Virtual Academy: https://mva.microsoft.com/en-US/training-courses/system-center-2012-r2-operations-manager-management-pack-8829?l=lSKqLpx2_7404984382

On Microsoft’s Wiki: https://social.technet.microsoft.com/wiki/contents/articles/15251.system-center-management-pack-authoring-guide.aspx

Closing:

If you have questions, comments or feedback feel free to feedback in our SCOM – Gitter – Community on: https://gitter.im/SCOM-Community/Lobby


Awesome! Thanks a lot for all your contribution and for being considerate for others by publishing it for everyone, Ruben 🙂 Wishing to have many more cool guest blogs from you!

You can get to know Ruben better here:

Ruben Zimmermann (A fantastic person who has a lot of great ideas) [Interview]

Cheers!

SCOM Event Based Monitoring – Part 2 – Rules

In the last post we discussed about event based monitoring options SCOM  provides with Monitors. You can find it here:

SCOM Event Based Monitoring – Part 1 – Monitors

In this post we are going to discuss the event based monitoring options using SCOM Rules. Basically the highlighted options in the image below:

1

As we can see, we have 2 kinds of rules for monitoring events. “Alert Generating Rules” and “Collection Rules”. Let’s walk through them one by one.

Alert Generating Rules:

As the name suggests, this type of rules raise an alert if they detect the event ID.

As you’re going through the Rule Creation wizard you will notice that there are several options in “Rule category”, as shown in the pic below. In my experience, there is no difference whatever you choose. These are more like for “Logical Grouping” of these rules, that you can maybe make use of if working on them through Powershell. Since we’re here to create an “alert generating” rule, the most obvious option here would be “Alert”.

1

Now, this step is pretty important and if you’re new to creating this, you’re very likely to miss this. As you reach the final window, on the bottom of “Alert Description” (which you can modify btw), is the box for “Alert Suppression”. This is used to tell SCOM what events should be considered “duplicates” and hence not raise a new alert if there’s already one open in active alerts. “Auto-suppression” happens automatically for monitors – it only increases the repeat count for every new detection – but for rules, you’ll have to do this manually. If you don’t do this, you’re gonna get a new alert every time this event is detected. In this demo, I’m telling SCOM to consider the event as duplicate if it has the same Event ID AND the same Event Source.

I HIGHLY recommend using this since I learned this hard way some time back. I missed out configuring alert suppression for some rule and a couple nights later, woke up to a call from our Command Center guys. They said, “Dude! We’ve received THOUSANDS of  emails in our mailbox since last 15 minutes…and they’re all the same. Help!”

I rushed and turned it off, further investigation brought to light the cause that something had gone wrong on the server and it was writing the same event over and over again, thousands of times in the event log, and since I had not configured the suppression criteria, it created a new alert every time and sent mail as set up in subscription. Now I check for suppression criteria at least 3 times 🙂

1

Now that is set up, I created the event in the logs with a simple PoSh:

Write-EventLog -Logname Application -ID 1000 -Source custom -Message "This is a test message"

And as you can see, the alert was raised.

1

Collection Rules:

Collection rules are used only to collect the events in the SCOM database. They do not create any alert. In fact, you’ll hardly even notice their presence at all. Why create these rules then? Most commonly for reporting/auditing purposes. You can also create a dashboard showing the detection of these events.

1

1

Notice in the left side of the wizard that there’s no “Configure Alerts” tab as we had in the “Alert Generating” rule.

So what this rule is going to do is detect the occurrence of event 500 with source “custom” and if it detects it, just saves it in the database.

I wrote this event in the logs a bunch of times, now let’s see if SCOM is really collecting them or not. We’ll create a simple “Event View” dashboard for this.

1

And yup, we can indeed see that the event is actually being collected by SCOM.

1

Here’s another thing that might be a bit tricky. After you create these 2 types of events, if you open the properties, you’ll see they’re almost identical. If you’ve named them properly, kudos to you but if you (or someone else) hasn’t, how will you find out whether the rule is generating alerts or just collecting events? Take a close look at the content in the yellow boxes below. See the difference?

Alert Generating Rule Properties:

1

Event Collection Rule Properties:

1

You rightly noticed that the response for the alert generating rule is “Alert”, which means this rule is going to respond to the event detection by generating an alert. If you click on the “Edit” option just beside that, you’ll see wizard for alert name change, format change, suppression criteria, etc.

On the other hand, the yellow box in the collection rule is “Event Data Collection Write Action” and “Event data publisher”, which indicates that it is only writing it in the databases. You can also verify this with Powershell:

1

You can also fetch a report for the event collection rule for auditing purposes, but unfortunately I do not have the Reporting feature installed in my lab so can’t show a demo for that. 🙂

Thus concluding the event monitoring options SCOM offers with monitors as well as Rules. Now you know what abilities SCOM has, now it’s up to you to decide which option do you wanna choose 🙂

Cheers!