Podcast #1: Understanding The IT Incident Management Process

Podcast Episode #1: Incident Management - The Daily Blocking and Tackling of an IT Department


Jack: Welcome. This is an episode of Transformative IT Service management, a podcast here at Bell-Tech Logics. We're gonna have a few people come in and talk about service management, some of the different processes that are part of service management and how they might impact you or your business. The first one that we're gonna cover today is the process of incident management. Joining me is Brenda Lichtenberg. Brenda is the Senior Vice President of Strategy and Portfolio. Brenda, welcome.

Brenda: Thank you. I'm excited to be here today.

Jack: We're gonna talk about incident management first so I guess we should set the baseline. What is incident management?

Brenda: Incident management really is the tracking of an incident, of something that is broken, when it comes in, and then tracking it from beginning to end. That includes identification, prioritization, and categorization of the call, and capturing the problem. There are also different aspects of incident management in terms of how I track the information regarding the incident, reporting and trending, as well as capturing special handling for an end user, such as VIP status. All of those rolled together give you a nice comprehensive overview of what incident management is.

Jack: And that incident, like you said, is something that's broken. We'll talk later in another episode about service requests, where maybe you're asking for something new, but an incident is when you have a piece of technology and it's broken. Incident management is the overall process by which you log that broken issue, assign it, send it out to get repaired, and then track all the data.

Brenda: Exactly. There's a lot of special handling that sometimes goes with that. When it's against a certain product, maybe you want an agent to collect certain information about a request. They'll work with the end user to collect that information and record it in the request. Then maybe they assign it to a special support group, and that might also include some special notification so you can make sure you expedite that request.

Jack: It sounds like incident management is sort of the daily blocking and tackling of how IT departments really take care of the business. Obviously, it's important to IT departments, why should businesses care about incident management?

Brenda: Really, tracking your incidents correctly matters outside of IT, if you will, because that's how you manage your business in day-to-day operations: understanding what's happening, what's broken, what needs to be fixed, and then addressing those issues in a timely fashion. Since there are different priorities in incident management, that will tell you if I have a priority one, or a severity one if you will. That means it's urgent and I need to get it addressed right away. Otherwise there are other issues where maybe it's a normal priority and you have a 24-hour period to resolve that issue. For each of the different kinds of incidents and priorities that you log, that really is a game-changer in your business: understanding what needs to be addressed now, re-prioritizing your workload, and addressing those issues to get critical things fixed so you do not have a cost impact directly to your bottom line.
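To make the priority targets Brenda describes concrete, here is a minimal Python sketch of how a priority could map to a resolution deadline and drive the work queue. Only the 24-hour window for normal priority comes from the conversation; the one-hour priority-one target, the `RESOLUTION_TARGETS` table, and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets. Only the 24-hour normal-priority window
# is mentioned in the conversation; the P1 value is an assumed placeholder
# for "urgent, address right away".
RESOLUTION_TARGETS = {
    "P1": timedelta(hours=1),
    "normal": timedelta(hours=24),
}


def resolution_due(priority: str, logged_at: datetime) -> datetime:
    """Return the time by which an incident of this priority should be resolved."""
    return logged_at + RESOLUTION_TARGETS[priority]


def work_queue(incidents):
    """Order open incidents so the one with the nearest deadline comes first."""
    return sorted(
        incidents,
        key=lambda inc: resolution_due(inc["priority"], inc["logged_at"]),
    )
```

Sorting by deadline rather than by arrival time is one simple way to express the re-prioritizing of workload mentioned above: an urgent incident logged later still moves ahead of an older normal-priority one.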

Jack: Yeah, that incident management process really gives the IT department a lot of information they can share with the business and really optimize everything together. Sounds like it's a little bit different than just incident handling where sometimes businesses may think that resolving an incident is just getting the right person to put in the right USB drive, download the update or replace a widget, but there's a whole process to it.

Brenda: Right. Incident handling is just that state of that incident. I have an incident, right? I'm going to apply this correction, these patches, and then resolve it. Incident management is really looking at the holistic process to say, did I attach the right knowledge article? Am I trending and reporting against it? It's being able to look back on it and say, "How often are we having these types of issues?" and "Should I be creating maybe a knowledge article against it, or maybe I need to change an SLA against it because we're seeing this so frequently?" It gives you knowledge and transparency into your environment so you can address those issues.

Jack: Well, you touched on several different service management processes there. I think it's important to know that when organizations are adopting ITIL or another framework for service management, incident management is often one of the early processes they try to adopt and implement and then really think through. I think that's because incident management really does touch a lot of the other service management processes.

Brenda: It sure does.

Jack: From knowledge management, where, to your point, you want to make sure that you have the right information in the hands of your service desk or your desk side or your engineers to resolve incidents as quickly as possible, to problem management, which we'll talk about more in our next podcast, where you're really looking to eliminate incidents altogether, as well as major incident management, when you have those business-stopping issues where everything is grinding to a halt and they have the highest level of severity and priority. I think one of the things that makes all of those processes connected is that you start with this core concept: in service management, you're tracking some foundational data and base information that you can apply and do things with to create a better service environment for the business users.

Brenda: Right. You mentioned incident management as being foundational. It really is. A lot of times clients start off with incident management, and when you do that, you have to take into consideration where you are gonna go next with incident management. When you bring in that foundation data, you want to make sure that you have the right contact information, which maybe is a direct feed from your Active Directory. You will also look at how that contact information is set up. Do I have a manager in my contact load, so that if you do start branching out to service request management, as you mentioned earlier, I can do automatic approvals by that contact record? There's also important information in the foundation data in terms of location. Maybe I have to dispatch somebody to the end user's site where they are located. It's all about efficiency in that foundation load. That can't be underestimated. You really need to spend some time and collect the right information, make sure it's normalized, make sure that it's accurate, and make sure that you have a timely feed, so that when new users are onboarded and old users have been off-boarded or need to be off-boarded, that's taken care of automatically.
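The foundation load Brenda describes can be pictured as a comparison between a directory feed and the contact records already in the service management tool. The sketch below is an illustration under stated assumptions: the dictionary layout, the `manager` field name, and the function `plan_contact_sync` are hypothetical and do not reflect any particular product or the Active Directory API.

```python
def plan_contact_sync(directory_users, contact_records):
    """Compare a directory feed with existing contact records.

    Both arguments are assumed to be dicts keyed by user ID. Each directory
    entry is a dict that may carry a "manager" field (a hypothetical layout).
    """
    directory_ids = set(directory_users)
    contact_ids = set(contact_records)
    return {
        # In the feed but not yet a contact: new users to onboard.
        "onboard": sorted(directory_ids - contact_ids),
        # Still a contact but gone from the feed: users to off-board.
        "offboard": sorted(contact_ids - directory_ids),
        # No manager recorded, so automatic approvals could not be routed.
        "missing_manager": sorted(
            uid for uid, user in directory_users.items()
            if not user.get("manager")
        ),
    }
```

Running a comparison like this on every feed is one way to keep onboarding and off-boarding automatic, and the `missing_manager` list flags records that would block manager approvals if service request management is added later.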

Jack: That's a great point. As a CIO or an IT leader, there's a great opportunity to take advantage of the information tracked inside of incident management to really look for areas of efficiency, to look for areas of either potential redundancy or opportunity to really make things run smoother. This connects back to problem management, which we'll talk about, but even more broadly than problem management, I've seen CIOs start to take incident data and correlate that with business metrics so they can really understand the business cost of downtime, if they have a service analytics data warehouse. They're able to say, "Hey, when this major incident was affecting this group of users, did this impact my productivity? Did this impact my revenue generation? Did it impact my profitability?" As an IT leader, those are some great metrics to really start to talk about ROI and help justify plans of where you may want to spend money on the business of IT, as it were.

On the topic of business and IT, one of the things that's on the news everywhere these days is automation. You almost can't turn the news on without hearing a story about robots and AI and automation and how machines are going to transform and change the business landscape. Certainly we've seen that happen a lot in manufacturing, but is there a place here in IT and more specifically in incident management?

Brenda: Yes. A really good time for automation is right now. The technology is such that it really lends itself to, as you were talking about a little bit earlier, reducing costs and increasing efficiencies. When you bring automation into your environment, that provides a lot of great value to your end users as well as to your service desk. You were talking about trending, and we look at the trending very seriously, because if you have something that's happening all the time, for instance password resets, that is a key element for saying we need a password reset tool. That can save you up to about 30% in your environment depending on your call volume and that type of thing. Password reset is just one example, but there are other things like intelligent agents.

We will look at the data that's logged, and we talked about the importance of foundation data and getting the right information and categorizing it. When you look at that data, that will tell you where you have deterministic incidents. That means incidents that we can resolve quickly and easily, and an intelligent agent can do that because they are repetitive. That will also save a call to the service desk, so an intelligent agent is something that we can use in a self-help portal. That is something that an end user these days really appreciates. Our user community is really changing dramatically these days, where they don't always want to make a call. They want to use a self-help portal that has great features like chat, intelligent agents, and very good knowledge, knowledge that lets you, for instance, point to a YouTube video where you can resolve an issue by watching the video versus going through step-by-step instructions.
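The trending idea above, spotting repetitive incidents such as password resets in the logged data, can be sketched in a few lines of Python. This is only an illustration: the `category` field, the `min_share` threshold, and the function name `automation_candidates` are assumptions, not part of any tool discussed in the episode.

```python
from collections import Counter


def automation_candidates(incidents, min_share=0.2):
    """Return (category, count) pairs that make up a large share of incidents.

    `incidents` is assumed to be a list of dicts with a "category" key.
    `min_share` is an arbitrary illustrative threshold: a category qualifies
    when it accounts for at least that fraction of all logged incidents.
    """
    counts = Counter(inc["category"] for inc in incidents)
    total = sum(counts.values())
    if total == 0:
        return []
    # most_common() yields categories from most to least frequent.
    return [
        (category, count)
        for category, count in counts.most_common()
        if count / total >= min_share
    ]
```

Categories that surface here are the repetitive incidents worth reviewing as candidates for a self-service tool or an intelligent agent. The result depends on consistent categorization at logging time, which is why the foundation data matters.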

Jack: By capturing that data as a part of incident management, you can feed that back-end process that's doing the analysis, and take that knowledge management article from having a potential audience of the service desk technician to the self-service portal.

A lot of what you've described when you're talking about automation is pretty, for lack of a better word, reactionary, right, where the user's seeking out help. Is there a concept with incident management on automation where we're getting in front of the user more proactively?

Brenda: That's something that is in fact getting very popular again. That's something that you want to invest in. That's really things like self-healing. What self-healing means is that I'm going to put a little agent, perhaps, on the device or the desktop, and that will allow you to understand what's happening on that desktop. It will send errors and notifications. It can even be configured to find slow response time in certain applications, depending on how you set it up and configure it. It can be as robust as you would like it to be. When these self-healing type of issues pop up, that will get triggered to an agent or to a service desk that will look at it and proactively reach out to you and say, "Hey, I see you're seeing an error with your PC, maybe in storage or memory and performance." Then they'll work with you to proactively address that issue. Depending on the issue, it can literally save an end user three to five days of lost productivity.

Jack: I can see how this connects to operations management, where instead of having a user wait for a server-side application to have an issue, there's some monitoring platform that's watching and looking for potential challenges with that application, and maybe it even auto-remediates through a script restarting a process.

Brenda: In fact, you mentioned a little bit earlier certain issues in a certain area by location, and really this is an ideal situation. Say you're tracking response time performance for Outlook: how long does it take to open up an email? All of a sudden, you see that all happening in one area, let's say New York. All these notifications are going off. You can do event management and have that triggered to pull up: maybe I've got an Outlook Exchange server issue in the New York area where all of these users are being impacted. It's a key element for proactively addressing issues before you have a major outage.
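The New York example amounts to grouping slow-response notifications by location and application and raising an event once enough distinct users are affected. The Python sketch below shows that grouping under stated assumptions: the alert fields (`location`, `application`, `user`), the `min_users` threshold, and the function name are hypothetical and not taken from any monitoring product.

```python
from collections import defaultdict


def locations_with_likely_outage(alerts, min_users=3):
    """Group alerts by (location, application) and flag widespread ones.

    `alerts` is assumed to be a list of dicts with "location", "application",
    and "user" keys. A group is flagged when at least `min_users` distinct
    users reported a problem; the threshold is an illustrative choice.
    """
    users_by_group = defaultdict(set)
    for alert in alerts:
        key = (alert["location"], alert["application"])
        # A set counts each user once, even if they trigger many alerts.
        users_by_group[key].add(alert["user"])
    return sorted(
        key for key, users in users_by_group.items()
        if len(users) >= min_users
    )
```

Counting distinct users rather than raw notifications keeps one noisy desktop from looking like a regional outage, which is the distinction between a single incident and the kind of event worth escalating before it becomes a major outage.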

Jack: Well, it's a good thing that our first topic for our podcast was incident management, because it seems like incident management really is connected to almost every other service management process that's part of the framework. Thanks, everyone, for tuning in. This has been an episode of Bell-Tech Logics Transformative IT podcast. We've talked about incident management. Tune in next time when we're going to cover problem management and talk a little bit about how it's different from incident management.
