
AI Ops: Proactive Problem Management

Did you know that Alemba Service Manager (ASM) can synchronize your real-time IT dashboards with AI Ops to provide advanced Proactive Problem Management?

ASM allows you to visualize the pattern of current Incidents in your system. AI Ops supports two methods of pattern recognition:

Absolute Count

AI Ops counts the number of Incidents that match the set criteria within a given period of time. If the threshold is exceeded, AI Ops automatically creates a Problem record and links the matching Incidents as children. The Problem record and a notification are then sent to the Problem Manager.
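The Absolute Count check can be illustrated with a minimal Python sketch. This is not ASM code: the `Incident` class, its fields, and the `absolute_count_matches` function are hypothetical names used only for illustration, and "exceeded" is assumed to mean strictly greater than the threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Incident:
    ref: str             # hypothetical Incident reference
    logged_at: datetime  # when the Incident was logged
    service: str         # hypothetical field used as a matching criterion


def absolute_count_matches(
    incidents: Iterable[Incident],
    matches: Callable[[Incident], bool],
    period_end: datetime,
    period: timedelta,
    threshold: int,
) -> List[Incident]:
    """Return the matching Incidents if their count exceeds the threshold, else []."""
    period_start = period_end - period
    matched = [
        i for i in incidents
        if period_start <= i.logged_at <= period_end and matches(i)
    ]
    # Assumption: "exceeded" means strictly greater than the threshold.
    return matched if len(matched) > threshold else []
```

A non-empty result corresponds to the Incidents that would be linked as children of the new Problem record.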

Change of Behavior Threshold

A recurring time period is configured, and the number of Incidents in the current period is compared with the number in the previous period. If the change between the periods exceeds a given percentage, a Problem record is created with all the Incidents in the current period linked as children, and the record and a notification are sent to the Problem Manager.

Example AI Ops logic diagram

Example: The number of matching Incidents recorded in time period A is compared with the number of Incidents recorded in time period B, and if the change exceeds the set threshold (X%), then a Problem record is logged with all the Incidents recorded in time period B attached as children.
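The comparison between period A and period B can be sketched in a few lines of Python. The function names are hypothetical, and two points are assumptions rather than documented ASM behavior: the change is treated as an increase from A to B, and an empty period A followed by any Incidents in period B is treated as an unbounded increase.

```python
def percent_change(previous_count: int, current_count: int) -> float:
    """Percentage change from the previous period (A) to the current period (B)."""
    if previous_count == 0:
        # Assumption: with no Incidents in period A, any Incident in period B
        # counts as an unbounded increase; none in either period is no change.
        return float("inf") if current_count > 0 else 0.0
    return (current_count - previous_count) / previous_count * 100.0


def behavior_changed(previous_count: int, current_count: int, threshold_pct: float) -> bool:
    """True when the increase from period A to period B exceeds X percent."""
    return percent_change(previous_count, current_count) > threshold_pct
```

For instance, 10 Incidents in period A and 15 in period B is a 50% increase, so a threshold of 40% would trigger a Problem record while a threshold of 50% would not.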

Using Dashboards to Support Problem Management

The dashboard engine in ASM ships with a set of reports that support the Problem Manager in analyzing the Incidents and events being logged in the system.

Information such as Incidents per Service, per Configuration Item, per Location and per Department can highlight infrastructural or training issues.

The effect of newly released Changes on the infrastructure can be monitored with dashboards that focus on the changed CIs or services, and on the users of those services, to look for post-Change spikes.

Screenshot of reporting Dashboard

Example: Powerful Problem Management dashboards help Problem Managers identify trends and discover root causes.

How to get started: Configuring AI Ops

This section describes how to configure AI Ops for Problem Management. To configure AI Ops, you must have AI Ops enabled in your IPK Security role.

You can schedule “AI Ops rules” that run and analyze your Call and Request activity. Each rule has a “threshold” (that is, a particular number of events within a running period) and a set of conditions which, when met, automatically trigger a new call/request in ASM.

Configuring AI Ops Rules

Using the AI Ops wizard, you will be guided through the steps required to configure a rule.

Mandatory steps and options are flagged with an asterisk (*). Each completed step is marked with a green tick, except the Welcome and Summary steps.

Step 1: Instructions (Optional)

This step provides an overview of the steps for setting up your AI Ops rule. You can clear ‘Show Instructions’ at startup to skip this step when creating new AI Ops rules or editing existing ones.

Step 2: Title and Description

Provide a name (title) and description for your rule. This step is mandatory and is therefore flagged with an asterisk (*). It is good practice to use a meaningful name to help identify the purpose of the rule.

Step 3: Data Set

Here you will decide if you want to analyze Incidents or Requests. Additionally, you can decide if the rule runs for a set time span or runs indefinitely.

Step 4: Selection Criteria

Here you will select the parameters in the call that you wish to pattern match on.

Step 5: Grouping Criteria

This step configures the matching rules and thresholds based on the parameters selected in Step 4.

Step 6: Schedule

In this step, schedule the time frame for your analysis, that is, when and how often the rule should run.
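To make the wizard steps concrete, the information collected in Steps 2 to 6 can be pictured as a single rule definition. The Python sketch below is purely illustrative: the class, field names, and schedule representation are assumptions and do not reflect how ASM stores rules internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AiOpsRuleSketch:
    # Step 2: Title and Description (mandatory)
    title: str
    description: str
    # Step 3: Data Set
    data_set: str = "Incidents"       # "Incidents" or "Requests"
    run_until: Optional[str] = None   # None means the rule runs indefinitely
    # Step 4: Selection Criteria (hypothetical field names and values)
    selection: Dict[str, str] = field(default_factory=dict)
    # Step 5: Grouping Criteria
    group_by: List[str] = field(default_factory=list)
    threshold: int = 0
    # Step 6: Schedule (hypothetical representation)
    run_every_minutes: int = 60

    def validation_errors(self) -> List[str]:
        """Return human-readable errors; an empty list means the rule is valid."""
        errors = []
        if not self.title.strip():
            errors.append("title is mandatory")
        if not self.description.strip():
            errors.append("description is mandatory")
        if self.data_set not in ("Incidents", "Requests"):
            errors.append("data_set must be 'Incidents' or 'Requests'")
        if self.run_every_minutes <= 0:
            errors.append("run_every_minutes must be positive")
        return errors
```

Grouping the settings this way mirrors the order of the wizard, so each field can be traced back to the step that collects it.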

Use Case for AI Ops

A Problem Manager suspects instability in the network environment. She can configure an automated AI Ops rule to log a Problem call whenever more than 5 high priority outage calls are logged against critical servers. She can configure the rule to exclude any servers that have a Physical Status of “In Test” or “Training Dedicated”. Finally, she can configure the call to be automatically assigned to the Problem Management team.

When the AI Ops rule runs and finds more than 5 high priority outage calls logged against critical servers, a new Problem is automatically logged by the system and forwarded to the Problem Management team.
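The logic of this use case can be sketched in Python as follows. The `Call` class, its fields, and the shape of the returned Problem are hypothetical; only the criteria (high priority, outage, critical server, the two excluded Physical Status values, and the threshold of more than 5) come from the scenario above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

EXCLUDED_PHYSICAL_STATUSES = {"In Test", "Training Dedicated"}
THRESHOLD = 5  # a Problem is logged when MORE than 5 calls match


@dataclass
class Call:
    ref: str               # hypothetical call reference
    priority: str          # e.g. "High"
    call_type: str         # e.g. "Outage"
    server_critical: bool  # True when the affected server is critical
    physical_status: str   # Physical Status of the affected server


def matching_calls(calls: List[Call]) -> List[Call]:
    """Keep high priority outage calls on critical servers, minus exclusions."""
    return [
        c for c in calls
        if c.priority == "High"
        and c.call_type == "Outage"
        and c.server_critical
        and c.physical_status not in EXCLUDED_PHYSICAL_STATUSES
    ]


def evaluate_rule(calls: List[Call]) -> Optional[Dict[str, object]]:
    """Return a Problem description when the threshold is exceeded, else None."""
    matched = matching_calls(calls)
    if len(matched) <= THRESHOLD:
        return None
    return {
        "type": "Problem",
        "assigned_to": "Problem Management",
        "children": [c.ref for c in matched],
    }
```

Calls against servers marked “In Test” or “Training Dedicated” never count toward the threshold, so test activity alone cannot trigger a Problem.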
