Friday, December 6, 2013

Do your intuitions come into play in your IT consulting work?

Got an interesting story from one of my friends from Bangalore. It goes like this....
During a Trivial Pursuit game with my family years ago, my two younger brothers were on one team. Whenever it was their turn, they would debate the question between themselves, veering off into all sorts of unrelated topics, false memories, and logical fallacies. It was all I could do to keep from laughing as I imagined how far from the truth their answer would be, but time and again, they surprised me by coming up with the correct answers.
In retrospect, I think that one or both of them must have known every correct answer, but they didn’t know how they knew it. All the so-called reasoning that led them to the answer was just a way to chew up some time while their brains worked on fetching it in the background. When the right answer finally surfaced, they recognized it as correct. It was one of the more striking examples of intuition in action that I have ever witnessed.
I had to admire their feat, but I could never imitate it or advise anyone else to attempt to do so. Like most people in my profession, I prefer sound logic and verifiable data when available. Perhaps more importantly, I want to know where I can’t be certain, and to what degree.
Even though “knowledge is power,” in business you can’t always afford to wait for knowledge; sometimes you need to make a decision based on what little you know, informed by your past experience. That’s where intuition plays a useful role, even in a technical occupation like IT consulting. Some people refer to it as “trusting your gut,” but I prefer to think of it as knowing something without knowing precisely why.
I don’t want to encourage magical thinking, so here’s a concrete example that’s simpler than your average software problem: a card game. In almost every game involving a deck of cards, it benefits your decision making to know what cards the other players have — but that information is usually hidden. As you become a more experienced player, you notice what cards have been played; perhaps you even keep a mental count of the more important cards that you see, and note behaviors that might indicate that your opponent does or does not have a specific card. If you keep playing long enough, you’ll “know” where certain cards are, without any scientific proof. It’s not magic — your brain can perceive a pattern based on your past experience for which you don’t yet possess a theory.
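If you like to see things concretely, here is a minimal sketch (purely illustrative, not real game logic) of the bookkeeping an experienced player’s brain does in the background: track the cards seen so far and estimate the chance that an opponent holds a particular card.

    # A minimal sketch of "card counting" as probability bookkeeping.
    # All names are illustrative; a standard 52-card deck is assumed.

    def chance_opponent_holds(card, cards_seen, opponent_hand_size, deck_size=52):
        """Estimate the probability that one opponent holds `card`,
        given every card we have already seen (our hand plus cards played)."""
        if card in cards_seen:
            return 0.0                        # its location is already known
        unseen = deck_size - len(cards_seen)  # cards whose location is unknown to us
        # The card is equally likely to be in any unseen position;
        # the opponent holds `opponent_hand_size` of those positions.
        return opponent_hand_size / unseen

    seen = {"QS", "KH", "7D", "2C", "AC"}     # cards observed so far
    print(chance_opponent_holds("AS", seen, opponent_hand_size=5))   # ~0.106

Keep updating the `seen` set as cards hit the table and the estimates sharpen automatically, which is roughly what an experienced player’s intuition is doing without the arithmetic.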
The same thing happens in IT consulting and in other businesses. When you’ve been there many times before, you can smell when something isn’t right — even if you can’t put a name on it. You can also notice an opportunity before you’ve thought of a plan to exploit it.
Thus it seems obvious that the more you know about a subject, the better your intuitions will be. In practice, though, once we’ve invested a lot of energy into a canonical way of thinking about a subject, it’s difficult to let any thought outside that discipline receive attention. We can become too narrowly focused to pick up on other cues. In my experience, the best intuitions come to me in situations where I have a lot of related experience, but just enough ignorance for the problem to feel novel.
That makes IT consulting a good career for me. All of my clients share the common problems of software development, but their individual challenges differ enough to keep me on my toes. I still strive for hard data where it can be found, and I work now more than ever on self-education, but where those fall short, I’m learning not to overthink it and to trust my intuitions.
That’s all from my experience in IT.

Wednesday, November 27, 2013

Successful IT Governance


There was a time when I was working in an organization that had no process in place for data center (DC) operations; every manager was setting up his own processes to help manage them. The bottom line was affected, and most employees were confused about their own and the organization’s growth, which ultimately led to mismanagement of vendors, suppliers, partners, and, most importantly, customers.
So I found myself wondering how to set up successful IT governance in a highly valued organization like that one.
I came across the following five aspects. After much thought, I believe these can help managers create successful IT governance in their organizations.
a.   Alignment of IT objectives with business objectives.
b.   Value delivery to drive maximum business value from IT investments.
c.   Practical risk management processes in place to ensure that risks are managed with respect to business objectives.
d.   Resource management: high-level ethical direction for outsourcing, insourcing, and the use of IT resources.
e.   Performance measurement to verify strategic compliance and the achievement of strategic IT objectives.

Sunday, October 13, 2013

The new world of risk management for Project Managers

After a long risk management session at the PMI Hyderabad chapter, in which project managers shared how they try to get their teams to consider all reasonable risks and opportunities, most of the PMs said they end such sessions with the wrap-up question, “Are there any other risks we haven’t captured?” Some sarcastic team member will inevitably respond with what he considers to be an out-there scenario: “The building could blow up!” or “The next great depression might start!” The point of these remarks is usually to tell the PM to stop taking everyone’s time spinning improbable scenarios of disaster and just let the engineers get on with their work. If the twenty-first century has taught us anything, it’s that these out-there predictions have a nasty habit of occurring. In fact, the incidence of these improbable events, and their obvious impact, has inspired a theory described in the best-selling book The Black Swan by Nassim Taleb. The author’s proposition, in short, is that the “human brain is wired for pattern recognition and so it sees patterns and narratives where none exist.” This gives rise to the fallacy that we can accurately predict the future from the past and that events will follow the patterns we have recently observed (a fishbone-diagram kind of reasoning). The occurrence of risk events previously thought of as remote has forced us to ask, “How did our risk models fail us so completely?”, and the ideas expressed in The Black Swan are influencing risk managers in all industries.

Risk management in IT projects

What does this have to do with risk management in a project context? Risk management can become a rote exercise, in which project teams plot out a few obvious problem scenarios, such as hardware that arrives dead or software that won’t install properly, and leave it at that. I’ve even seen teams that keep a standing risk list and just append it to their scope document, assuming that the risks and contingency measures are the same for every project. Risks are often evaluated in isolation from each other, so the cascade effects from one negative event are not thought through and managed properly. IT outsourcing, or engaging consultants, is a form of risk transfer; however, many IT consultants and service firms don’t have a firm grasp of risk pricing, so they allow their customers to shift risk to them without getting the appropriate rewards for taking on that risk. So, in this new environment where everyone is more conscious of risk management, how should IT PMs change their approach to risk? Here are a few notable points:
  • Consider the improbable: While it may be unlikely that the next great depression or a collapsing building will impact your next project, we’ve seen that these unlikely events can occur. Don’t allow the sarcasm or confidence of your delivery team to deter you from asking the right questions about catastrophic possibilities and from considering how you’ll mitigate these risks. The building may not fall, but the data center could get flooded, lose power, or suffer an EMP attack, or the delivery truck could get stolen (all scenarios I’ve seen play out in real life, and seen projects fail because of them). Force your team to walk through the less obvious, and sometimes the improbable, and you’ll find that this discussion flushes out some real risks that you must consider.
  • Follow the cascade: Adverse events often have deep cascade effects. In IT project work, I’ve seen this phenomenon when teams don’t plot the intersections of events and requirements and fail to account for the impact that adverse events will have across the entire spectrum of project tasks. The simplest hardware failure or software glitch can cascade across the entire project, but teams are frequently divided into infrastructure, production support, packaged software integration, and development. It’s the role of PMs to ensure that the cross-discipline impacts of risks are identified and mitigated, as the PM is often the only resource on the project looking across all the delivery elements.
  • Beware hidden assumptions: As risk managers have learned, assumptions about what might fail, how it would affect the rest of the project, and what the worst-case scenario might look like are often colored by emotion, ego, rationalization, and self-interest. As any experienced PM can confirm, any IT technician who has had a good run can fall into this category. We’re all subject to the risk of believing in our own competence, and we all have highly tuned rationalization engines that can convince us that things will work out well because they have so far. PMs must challenge these rationalizations and expressions of overconfidence to ensure that emotion is not blocking the team from uncovering the real risks to the project.
  • Get paid for risk: One of the most characteristic errors I see IT service firms and contract PMs make is allowing the client to transfer risk to them without getting paid for it. Too many firms end up doing rework or free work on projects that were inherently risky, but were priced at the same rate as less risky, operational-type IT projects. There’s risk in any engagement, whether you’re upgrading Windows or developing entirely new software from scratch; many firms will price these efforts at the same billable rate, often without a risk contingency budget, and then end up eating the overages when things go wrong. I would suggest that every project estimate include a risk premium, as sketched below. Presenting the internal or external client with a separate risk premium sends the message that the team has seriously considered the level of risk the project carries. It also prepares the client for the reality that every project will have unknowns and adverse events.
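To make that last point concrete, here is a back-of-the-envelope sketch of pricing a risk premium as the expected cost of the identified risks; all the probabilities and dollar figures are hypothetical.

    # Hypothetical numbers: price a risk premium as the expected cost
    # of identified risks (probability x impact), plus a buffer for unknowns.

    risks = [
        {"name": "data center power loss", "probability": 0.05, "impact": 80_000},
        {"name": "key vendor slips",       "probability": 0.20, "impact": 25_000},
        {"name": "scope rework",           "probability": 0.30, "impact": 15_000},
    ]

    expected_loss = sum(r["probability"] * r["impact"] for r in risks)
    unknown_unknowns_buffer = 0.10       # flat 10% on top for "black swans"
    risk_premium = expected_loss * (1 + unknown_unknowns_buffer)

    print(f"Expected loss: ${expected_loss:,.0f}")   # $13,500
    print(f"Risk premium:  ${risk_premium:,.0f}")    # $14,850

Even a crude model like this forces the conversation about which risks were priced in, which is most of the value.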
IT project managers in the post-9/11 era have learned that our risk calculations must be broader and deeper than they were previously. In the world of the IT PM, we must take advantage of the painful lessons of other project and risk managers and ensure that we’re thinking about the mitigation of even unlikely events and that we’re considering the cascade effects of every risk we identify.
That’s all from me for now.

Saturday, September 21, 2013

What are the data center technologies?

Alternative Energy: Solar, wind and hydro show great potential for generating electricity in an eco-friendly manner, and nuclear and hydro show great potential for grid-based green power. However, the biggest challenge when it comes to using alternative energy for your data center applications is the need for a constant supply at high service levels. If you use alternative energy but still need to buy from the local power company when hit with peak loads, many of the economic benefits you're reaping from the alternative energy source will disappear quickly. As new storage mechanisms are developed that capture and store excess capacity so it can be accessed when needed, alternative energy sources will play a much greater role in the data center than they do today. Water and air based storage systems show great potential as eco-friendly energy storage options.
Ambient Return: This is a system whereby air returns to the air conditioner unit naturally and unguided. This method is inefficient in some applications because it is prone to mixing hot and cold air, and to stagnation caused by static pressure, among other problems.
Chiller based cooling: A type of cooling where chilled water is used to dissipate heat in the CRAC unit (rather than glycol or refrigerant). The heat exchanger in a chiller based system can be air or water cooled. Chiller based systems provide CRAC units with greater cooling capacity than DX based systems. Besides removing the DX limitation of a 23°F spread between output and input, a chiller system can adjust dynamically based on load.
Chimney effect: Just as your home chimney leverages air pressure differences to drive exhaust, the same principle can be used in the data center. This has led to a common design in which cool air is fed below a raised floor and pulled into the data center as hot air escapes above through the chimney. This design creates a very efficient circulation of cool air while minimizing air mixing.
Cloud computing: This is a style of computing that is dynamically scalable through virtualized resources provided as a service over the Internet. In this model the customer need not be concerned with the technical details of the remote resources. (That's why it is often depicted as a cloud in system diagrams.) There are many different types of cloud computing options with variations in security, backup, control, compliance and quality of service that must be thoroughly vetted to assure their use does not put the organization at risk.
Cogeneration: This is the use of an engine (typically diesel or natural gas based) to generate electricity and useful heat simultaneously. The heat emitted by the engine in a data center application can be used by an "absorption chiller" (a type of chiller that converts heat energy into cooling), providing cooling benefits in addition to electric power. In addition, excess electricity generated by the system can be sold back to the power grid to defray costs. In practice, the effective ROI of cogeneration is heavily dependent on the spread between the cost of electricity and fuel. The cogeneration alternative will also contribute to a substantial increase in CO2 emissions for the facility. This runs counter to the trend toward eco-friendly solutions and will create a liability in Cap and Trade carbon trading.
Colocation: Colocation is one of several business models where your data center facilities are provided by another company. In the colocation option, data centers for multiple organizations can be housed in the same facility, sharing common power and cooling infrastructure and facilities management. Colocation differs from a dedicated hosting provider in that the client owns its own IT systems and has greater flexibility in what systems and applications reside in its data center. The lines are blurred between the various outsourcing models, with variations in rights, responsibilities and risks. For this reason, when evaluating new facilities it is important to make sure the business terms align properly with your long term needs for the space.
Containers: The idea of a data center in a container is that all the power, cooling, space and connectivity can be provisioned incrementally through self contained building blocks, or standard sized shipping containers. These containers can be placed outside your place of business to expand data center capacity or may be deployed in a warehouse type environment. The primary benefits data center containers provide are that they support rapid deployment and that they are integrated and tuned to support very high power densities. Containers have been embraced for use in cloud type services by Google and Microsoft. The potential downsides of containers are several. They are expensive (more per usable SF than custom built facilities), tend to be homogeneous (designed for specific brands/models of systems) and are intended for autonomous operation (the container must remain sealed to operate within specifications).
CRAC (Computer Room Air Conditioner): A CRAC is a specialized air conditioner for data center applications that can add moisture back into the air to maintain the proper humidity level required by the electronic systems.
DX cooling (direct expansion): A compressor and glycol/refrigerant based system that uses airflow to dissipate heat. The evaporator is in direct contact with the air stream, so the cooling coil of the airside loop is also the evaporator of the refrigeration loop. The term "direct" refers to the position of the evaporator with respect to the airside loop. Because DX-based systems can reduce the air temperature by a maximum of 23°F, they are much more limited in application when compared to more flexible chiller based systems.
Economizer: As part of a data center cooling system, air economizers expel the hot air generated by the servers/devices outdoors and draw in the relatively cooler outside air (instead of cooling and recirculating the hot air from the servers). Depending on the outdoor temperature, the air conditioning chiller can either be partially or completely bypassed, thereby providing what is referred to as free cooling. Naturally, this method of cooling is most effective in cooler climates.
Fan tile: A raised floor data center tile with powered fans that improve airflow in a specific area. Fan tiles are often used to help remediate hot spots. Hot spots are often the result of a haphazard rack and server layout, or an overburdened or inadequate cooling system. The use of fan tiles may alleviate a hot spot for a period of time, but improved airflow and cooling systems that reduce electricity demands generally are a better option for most facilities.
Floor to Ceiling Height: In modern, high-density data centers, the floor to ceiling height has taken on greater importance in site selection. In order to build a modern, efficient facility, best practices now call for a 36-inch (or more) raised floor plenum to distribute cool air efficiently throughout the facility (with overhead power and cabling). In addition, by leveraging the chimney effect and hot air return, the system can efficiently reject the hot air while introducing a constant flow of cool air to the IT systems. To build a facility upgradeable to 400 watts/SF, you should plan on a floor to ceiling height of at least 18 feet. Some data center designs forego a raised floor and utilize custom airflow ducting and vertical isolation. Since this is a fairly labor intensive process and is tuned to a specific rack layout, it may not be suitable for installations where the floor plan is likely to evolve over the life of the data center.
Flywheel UPS system: A low-friction spinning cylinder that generates power from kinetic energy, and continues to spin when grid power is interrupted. The flywheel provides ride-through electricity to keep servers online until the generators can start up and begin providing power. Flywheels are gaining attention as an eco-friendly and space saving alternative to traditional battery based UPS systems. The downside to flywheel power backup is that the reserve power lasts only 15-45 seconds as compared to a 20 minute window often built into battery backups.
Hot Aisle/Cold Aisle: Mixing hot air (from servers) and cold air (from air conditioning) is one of the biggest contributors to inefficiencies in the data center. It creates hot spots, inconsistent cooling and unnecessary wear and tear on the cooling equipment. A best practice to minimize air mixing is to align the racks so that all equipment exhausts in the same direction. This is achieved simply by designating the aisles between racks as either exclusively hot-air outlets or exclusively cool-air intakes. With this type of deployment, cold air is fed to the front of the racks by the raised floor and then exhausted from the hot aisles overhead.
NOC (Network Operations Center): A service responsible for monitoring a computer network for conditions that may require special attention to avoid a negative impact on performance. Services may include emergency support to remediate Denial-of-Service attacks, loss of connectivity, security issues, etc…
Rack Unit: A rack unit or U (less commonly, RU) is a unit of measure describing the height of equipment intended for mounting in a computer equipment mounting rack. One rack unit is 1.75 inches (44.45 mm) high.
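Since the unit is fixed at 1.75 inches, converting between U and physical height is simple arithmetic. A quick sketch:

    # One rack unit (U) is defined as 1.75 inches (44.45 mm).
    INCHES_PER_U = 1.75
    MM_PER_U = 44.45

    def rack_height(units):
        return units * INCHES_PER_U, units * MM_PER_U

    inches, mm = rack_height(42)   # a common full-height rack
    print(f"42U = {inches:.2f} in = {mm:.1f} mm")   # 42U = 73.50 in = 1866.9 mm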
RTU (Rooftop Unit): RTUs allow facilities operators to place data center air conditioning components on the building's roof, thereby conserving raised white space while improving efficiency. In addition, as higher performance systems become available, RTUs can be easily upgraded without affecting IT operations.
Power-density: As servers and storage systems evolve to become ever more powerful and compact, they place a greater strain on the facility to deliver more power, reject more heat and maintain adequate backup power reserves (both battery backup and onsite power generation). When analyzing power-density, it is best to think in terms of kW/rack and total power, not just watts per square foot (which is a measure of facility capacity).
Power Density Paradox: Organizations with limited data center space often turn to denser equipment to make better use of the space available to them. However, due to the need for additional power, cooling and backup to drive and maintain this denser equipment, an inversion point is reached where the total need for data center space increases rather than falls. This is the power density paradox. The challenge is to balance the density of servers and other equipment with the availability of power, cooling and space in order to gain operating efficiencies and lower net costs.
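A toy model makes the inversion point easier to see. The coefficients below are entirely hypothetical; the only point is that support space grows with density while IT floor space shrinks, so total space bottoms out and then rises again.

    # Toy model of the power density paradox. All coefficients are hypothetical.
    # For a fixed total IT load, denser racks shrink the IT floor space but
    # inflate the support space needed per kW, so total space has a minimum.

    TOTAL_LOAD_KW = 500
    RACK_FOOTPRINT_SF = 30          # rack plus aisle clearance

    def total_space(kw_per_rack):
        racks = TOTAL_LOAD_KW / kw_per_rack
        it_space = racks * RACK_FOOTPRINT_SF
        # assume support space per kW climbs with density (bigger CRACs, UPS, gensets)
        support_sf_per_kw = 5 + 1.5 * kw_per_rack
        return it_space + TOTAL_LOAD_KW * support_sf_per_kw

    for density in (2, 4, 8, 16, 24):
        print(f"{density:>2} kW/rack -> {total_space(density):>8,.0f} SF")
    # In this toy model total space is smallest near 4 kW/rack and grows again
    # at higher densities: that turnaround is the paradox.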
Raised-floor plenum: This is the area between the data center sub floor and the raised floor tiles. It is typically used to channel pressurized cold air up through floor panels to cool equipment. It has also been used to route network and power cables, but this is not generally recommended for new data center design.
Remote hands: In a hosted or colocation data center environment, remote hands refers to the vendor-supplied, on-site support services for engineering assistance, including the power cycling of IT equipment, visual inspection, cabling and maybe even swap out of systems.
Steam Humidification: Through the natural cooling process of air conditioning, the humidity levels of a data center are reduced, just as you would find in a home or office air conditioning environment. However, due to the constant load of these AC systems, too much moisture is removed from most IT environments and must be reintroduced to maintain proper operating humidity levels for IT equipment. Most CRAC units use a relatively expensive heat/steam generation process to increase humidity. These steam-based systems also increase the outflow temperature from the CRAC unit and decrease its overall cooling effectiveness.
Ultrasonic Humidification: Ultrasonic humidification uses a metal diaphragm vibrating at ultrasonic frequencies and a water source to introduce humidity into the air. Because it does not use heat and steam to create humidity, ultrasonic systems are 95% more energy efficient than the traditional steam-based systems found in most CRAC units. Most environments can easily be converted from steam based to ultrasonic humidification.
UPS (Uninterruptible Power Supply): This is a system that provides backup electricity to IT systems in the event of a power failure until the backup power supply can kick in. UPS systems are traditionally battery and inverter based systems, with some installations taking advantage of flywheel-based technology.
VFD (Variable Frequency Drive): A system for controlling the rotational speed of an alternating current (AC) electric motor by controlling the frequency of the electrical power supplied to the motor. VFDs save energy by allowing the volume of fluid/air to adjust to match the system's demands, rather than running the motor at full capacity at all times.
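The savings come from the fan affinity laws: airflow scales roughly linearly with motor speed, while power draw scales with roughly the cube of speed. A quick sketch:

    # Fan affinity laws: flow ~ speed, power ~ speed^3.
    # Running a fan at 80% speed still moves 80% of the air
    # but draws only about half the power.

    def relative_power(speed_fraction):
        return speed_fraction ** 3

    for s in (1.0, 0.9, 0.8, 0.6):
        print(f"{s:.0%} speed -> {relative_power(s):.0%} power")
    # 100% -> 100%, 90% -> 73%, 80% -> 51%, 60% -> 22%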
Virtualization: As servers have become more and more powerful, they have also (in general) become underutilized. The challenge to IT organizations has been to compartmentalize applications so they can be self contained and autonomous while at the same time sharing compute capacity with other applications on the same device. This is the challenge addressed by virtualization. Virtualization is the creation of a virtual (rather than actual) version of something, such as an operating system, a server, a storage device or network resources. Through virtualization, multiple resources can reside on a single device (thereby addressing the problem of underutilization) and many systems can be managed on an enterprise-wide basis.
Watts per Square Foot: When describing a data center's capacity, watts per square foot is one way to describe the facility's aggregate capacity. For example, a 10,000 square foot facility with 1 MW of power and cooling capacity will support an average deployment of 100 watts per square foot across its raised floor. Since some of this space may have CRAC units and hallways, the effective power density supported by the facility may be much greater (up to the 1 MW total capacity). Facilities designed for 60 W/SF deployments just a few years ago cannot be upgraded to support the 400 W/SF loads demanded by modern, high density servers.
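The arithmetic is straightforward. A quick sketch using the example above:

    # Facility capacity expressed as watts per square foot.
    capacity_watts = 1_000_000    # 1 MW of power and cooling
    raised_floor_sf = 10_000      # raised floor area

    watts_per_sf = capacity_watts / raised_floor_sf
    print(f"{watts_per_sf:.0f} W/SF")   # 100 W/SF

    # The same facility could host a denser deployment on part of its floor,
    # as long as the aggregate load stays under the 1 MW total.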

Sunday, August 11, 2013

Drive-by tasks as a Project? How to prioritize them?

Basically, “drive-by” projects are all those ad-hoc tasks or initiatives that get dropped in your lap at random intervals every day. They suck up time, they suck up resources, they distract project teams, and they have the potential to push active projects behind schedule. Often they are emergency projects with no strategic initiative attached to them, which matters for organizations that rely on project-based work to keep teams focused on the right projects.
Don’t get me wrong. I’m not suggesting that they are of no value. In fact, it’s not always about separating the good projects from the bad projects. It’s usually a matter of choosing the best projects, the projects that will provide the most business value, from a list of good potential projects. Unfortunately, when project teams are faced with a “drive-by” project, all the work to keep teams focused on the initiatives that drive the most business value gets thrown out the window. I believe that’s why the “get’er done” or “drive-by” project is such a problem. They may be worthy, but if they don’t measure up to the “does this provide the most value” test, they ultimately limit an organization’s capacity to work on the things that do. And that negatively impacts productivity, and ultimately profitability.
In theory, everyone agrees with this; practice is something different. In the heat of the moment, it’s difficult for decision-makers to step back and ask the question, “Will this drive-by project provide enough value that someone should drop what he or she is doing to work on it?” Sometimes the answer is definitely yes, but there are times when the answer should be no. If nobody asks the question, project teams can end up chasing projects of minimal value, or at least projects that haven’t been vetted to make sure they are the best projects for teams to be working on.
Project and portfolio management best practices revolve around the concept of identifying those projects that meet certain criteria, creating a plan, and then executing on the plan. Project management software does a pretty good job of helping do that. However, sometimes we need to ask ourselves, “How does my work management methodology address the ‘drive-by’ tasks and projects that come up every day?” It doesn’t have to be a catastrophic failure that causes an organization to falter. Sometimes it’s the accumulated weight of a thousand insignificant inefficiencies that causes the most damage.
How does your work management methodology address the “drive-by” project? Even if your software doesn’t, feel free to share what you do to keep your project teams focused on the right projects.

Sunday, July 14, 2013

How to Reinvent Your Personal Brand!

What are you known for? What do people say about you when you leave the room? (They are talking about you, aren't they?) How can you burnish your reputation to win that promotion or land that new client? You've diligently focused on your personal brand for years — but what if you now want to reinvent yourself? It happens all the time. A financial services executive moves into retail. A techie wants to try marketing. A VC wants to jump ship and become a life coach. Your path may make perfect sense to you, but how can you convince others to embrace your new brand — and take you seriously? Here are five steps to reinventing yourself for the business marketplace.
  1. What's Your Destination? First, you need to develop a detailed understanding of where you want to go, and the knowledge and skills necessary to get there. If you've been a techie for the past decade, you may understand every new marketing toy out there, from Facebook to Foursquare. But can you effectively convey that knowledge to a non-technical audience? Learning the skills you need will help you gain the confidence necessary to start identifying (and publicizing) yourself in your new identity.
  2. Leverage Your Points of Difference. In marketing, we call it a USP — a "Unique Selling Proposition". What makes you different from anyone else? That's what people will remember, and you can use it to your advantage.
  3. Develop a Narrative. You used to write award-winning business columns — and now you want to review restaurants? It's human nature to have many interests, to seek new experiences, and to want to develop new skills over the course of your life. Unfortunately, there's a popular word to describe that profound quest: dilettante. It's unfair, but to protect your brand you need to develop a coherent narrative arc that explains to people — in a nice, simple way so they can't miss it — exactly how your past fits into the present. "I used to write about the business side of many industries, including food and wine," you could say. "I realized my big-picture knowledge about agricultural trends and business finance made me uniquely positioned to cover restaurants with a different perspective." It's like a job interview — you're turning what could be perceived as a weakness (he doesn't know anything about food, because he's been a business reporter for 20 years) into a compelling strength that people can remember (he's got a different take on the food industry because he has knowledge most other people don't).
  4. Reintroduce Yourself. The vast majority of people, regrettably, aren't paying much attention to you. That means their perceptions are probably a few years out of date — and it's not their fault. With hundreds (or thousands) of Facebook friends and vague social connections, we can't expect everyone to remember all the details of our lives. So we have to strategically re-educate our friends and acquaintances — because, especially if we're launching a new business venture, they're going to be our buyers and recommenders. That means a concerted effort to phone or email everyone on your list — individually — to let them know about your new direction and, where appropriate, ask for their help, advice, or business.
  5. Prove Your Worth. There's a difference between my knowing that you've launched a new graphic design business and trusting that you'll do a good job for clients. I may like you a lot, but unless I see proof of your skills, I may hesitate to put my own reputation on the line by sending you referrals. That's where blogs, podcasts, videocasts, and other forms of social media come in. It's critical to let potential customers see what you're about and test drive your approach before they make a large commitment. Checking out your image gallery and seeing a roster of attractive corporate logos you've designed may allay my fears enough to send you that major new account.
So what are your best strategies for reinventing your personal brand?

Monday, June 10, 2013

PMP and ITIL’s effects on the IT biz…

So, you want your biz to grow and prosper. If you want your organization to be truly effective, certification in both the ITIL framework and the project management framework will help your company get there. Implementing various levels of checkpoints (audit and control) along the way ensures that you keep your services up to date and performing satisfactorily to meet customer need and demand. Ignoring either framework can make IT projects harder to carry out, wasting much-needed time, money, skills, and, most importantly, human resources. ITIL and PMP can have a strong positive impact on how IT departments support the biz. At this time, however, many IT organizations are still slow to understand the power that the ITIL framework and the project management framework, when combined, have to ensure that projects are finished and implemented in a timely manner. When you choose to integrate two highly efficient measures of accountability and risk management, you choose to have an organization that not only runs smoothly but is ahead of the competition at every turn in the road.

Wednesday, May 15, 2013

Change Management Do's and Don'ts

The goal of a successful Change Management process implementation is to reduce the amount of unplanned work as a percentage of total work done. Organizations that are in a constant firefighting mode can have this percentage at 65 percent or even higher.
The first phase of a Change Management implementation as outlined in the Visible Ops Handbook (a useful guidance and prescriptive roadmap for organizations beginning or continuing their IT process improvement journey) resembles the triage system used by hospitals to allocate scarce medical resources. In a similar fashion, IT must identify the most critical systems generating the most unplanned work and take appropriate action to gain control.
The primary goal of this phase is to stabilize the environment, allowing work to shift from perpetual firefighting to more proactive work that addresses the root causes of problems. Start by identifying the systems and business processes that generate the greatest amount of firefighting. When problems are escalated to IT operations, which servers, networking devices, infrastructure or services are constantly being revisited each week (or worse, each day)?
These items are your list of "most fragile patients," the ones generating the most unplanned work. These are the patients that must be protected from uncontrolled changes, both to curb firefighting and to free up enough cycles to start building a safer and more controlled route for change.
For each fragile patient (i.e. server, networking device, asset, etc.), do the following:
Reduce or eliminate access: Clear everyone away from the asset unless they are formally authorized to make changes. Because these assets have low change success rates, you must reduce the number of times the dice are rolled.
Document the new change policy: Keep the change policy simple: "Absolutely no changes to this asset unless authorized by me." This policy is the preventive control and creates an expectation of behavior.
Notify stakeholders: After the initial change policy is established, notify all of the stakeholders about the new process. Make sure the entire staff sees it: Email it to the team, print it out, and add it to login banners.

Create change windows: Work with stakeholders to identify periods of time when changes to production systems can be made. Your goal will be to eventually schedule all changes into these maintenance windows. Amend the change policy accordingly. For example: "Once I authorize the changes, I will schedule the work to be performed during one of the defined maintenance windows, on either Saturday or Sunday between 3 and 5 pm."
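A sketch of how such a policy could be checked mechanically; the window times mirror the example above, and everything else is hypothetical:

    # Hypothetical check that a proposed change falls inside a maintenance window.
    # Windows mirror the example policy: Saturday/Sunday, 3-5 pm.
    from datetime import datetime

    MAINTENANCE_WINDOWS = [
        {"weekday": 5, "start_hour": 15, "end_hour": 17},   # Saturday 3-5 pm
        {"weekday": 6, "start_hour": 15, "end_hour": 17},   # Sunday 3-5 pm
    ]

    def in_maintenance_window(when: datetime) -> bool:
        return any(
            when.weekday() == w["weekday"]
            and w["start_hour"] <= when.hour < w["end_hour"]
            for w in MAINTENANCE_WINDOWS
        )

    proposed = datetime(2013, 5, 18, 16, 0)   # a Saturday at 4:00 pm
    print(in_maintenance_window(proposed))    # True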
Reinforce the process: By now, you have defined a clear expectation of how changes are to be made. Make sure that people are aware of the new process and reinforce it constantly. For example; "Team, let me be clear on this: These processes are here to enable the success of the entire team, not just individuals. Anyone making a change without getting authorization undermines the success of the team, and we'll have to deal with that. At a minimum, you'll have to explain why you made your cowboy change to the entire team. If it keeps happening, you may get the day off, and eventually, it may prevent you from being a part of this team."

Electrify the Fence
Put a fence around the systems where unauthorized changes are causing the most carnage. Next, you will electrify the fence in order to keep everyone accountable and responsible for playing by the rules. The goal is to start creating a culture of change management.
To do this, proper change monitoring must be in place so you can trust, but verify. You will use this instrumentation to detect and verify that changes are happening within the specified change management process, and also to negatively reinforce and deter changes that are not.
You must be aware of changes on all infrastructure that you are managing: servers, routers, network devices, databases and so forth. Each detected change must either map to authorized work, or it must be flagged for investigation.
Critical questions that need to be answered are:
Who made the change?
What did they change?
Should it be rolled back? If so, then how?
How do we prevent it from happening again in the future?
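In practice, the verification described above (mapping each detected change to authorized work) can be as simple as a set difference between what monitoring detected and what the change process approved. A minimal sketch, with hypothetical asset and RFC names:

    # Hypothetical reconciliation of detected changes against authorized RFCs.
    # Every detected change must map to authorized work or be flagged.

    authorized = {"RFC-101", "RFC-102", "RFC-105"}   # approved work orders
    detected = [
        {"asset": "core-switch-1", "rfc": "RFC-101"},
        {"asset": "db-server-3",   "rfc": None},      # no work order attached!
        {"asset": "web-server-7",  "rfc": "RFC-102"},
    ]

    flagged = [c for c in detected if c["rfc"] not in authorized]
    for change in flagged:
        print(f"INVESTIGATE: unauthorized change on {change['asset']}")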
The key to creating a successful culture of change management is accountability. If the change process is repeatedly bypassed, management must be willing to take appropriate disciplinary action.

Create the Change Team
In this next step, you will continue to develop the change management process by creating a Change Advisory Board (CAB), comprised of the relevant stakeholders of each critical IT service. These stakeholders are the people who can best make decisions about changes because of their understanding of the business goals, as well as technical and operational risks.
Create a Change Request Tracking System
A prerequisite for any effective Change Management process is the ability to track requests for changes (RFCs) through the authorization, implementation, and verification processes.
Paper-based manual systems quickly become impractical when the organization is large or complex, or when the number of changes is high. Because of this, most groups use some computerized means to track RFCs and assign work order numbers. Some refer to these applications as ticketing systems or change workflow systems. The primary goals of a change request tracking system are to document and track changes through their lifecycle and to automate the authorization process.
Secondarily, the system can generate reports with metrics for later analysis. Each change requester should gather all the information the Change Manager needs to decide whether the change should be approved. In general, the riskier the proposed change, the more information is required.
For instance, a business as usual (BAU) change, such as rebooting a server or rotating a log file, may require very little data and oversight prior to approval. On the other hand, a high-risk change such as applying a large and complex security patch on a critical production server may not only require good documentation of the proposed change, but also extensive testing before it can even be considered for authorized deployment.
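As a sketch of what such a tracking system might store per RFC, with the amount of required detail scaled to risk (the field names are hypothetical, not from any particular ticketing product):

    # Hypothetical minimal RFC record for a change-tracking system.
    # Higher-risk changes require more supporting documentation.
    from dataclasses import dataclass

    @dataclass
    class ChangeRequest:
        rfc_id: str
        asset: str
        description: str
        risk: str                     # "bau", "normal", or "high"
        rollback_plan: str = ""
        test_evidence: str = ""
        status: str = "submitted"     # -> authorized -> implemented -> verified

        def ready_for_review(self) -> bool:
            if self.risk == "bau":
                return True                               # minimal oversight
            if self.risk == "high":
                return bool(self.rollback_plan and self.test_evidence)
            return bool(self.rollback_plan)

    rfc = ChangeRequest("RFC-201", "prod-db-1", "apply security patch", risk="high")
    print(rfc.ready_for_review())   # False: no rollback plan or test evidence yet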
Start Weekly Change Management Meetings (to Authorize Change) and Daily Change Briefings (to Announce Changes)
Now that you have identified the change stakeholders by creating the CAB, the next step is to create a forum for them to make decisions on requested changes.
The CAB will authorize, deny, or negotiate a change with the requester. Authorized changes will be scheduled, implemented, and finally verified. The goal is to create a process that enables the highest rate of successful change throughout the organization with the least amount of bureaucracy possible.
While they may seem unnatural at first, with practice, weekly 15 minute change management meetings are possible. Take special care to avoid an attitude of "just get it done," which allows people to make changes that circumvent the change approval process.
If you make it easy for all changes to flow through your process, it will soon be easier to use the process than to circumvent it, even during emergencies. CABs must meet on a regular published schedule that all stakeholders understand. To start, each CAB should meet weekly.
Miscellaneous Change Management Do's and Don'ts
Here are some tips for change management.
Items to do:
Do post-implementation reviews to determine whether the change succeeded or not.
Do track the change success rate.
Do use the change success rate to learn and avoid making historically risky changes.
Do make sure everyone attends the meetings, otherwise auditors have a good case that this is a nonfunctioning control.
Do categorize the disposition of all changes. In other words, all outcomes must be documented once a change is approved. Three potential outcomes are:
Change withdrawn - the change requester rescinds the change request and records the reason why. This should not be flagged as a failed change in change metrics.
Abort - the change failed, accompanied by documentation of what went wrong.
Completed successfully - the change was implemented and is functioning appropriately.
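Given those three dispositions, the change success rate follows directly; note that withdrawn changes drop out of the calculation entirely, as prescribed above. A quick sketch:

    # Change success rate: withdrawn changes are excluded entirely,
    # per the disposition rules above.

    changes = ["completed", "completed", "aborted", "withdrawn", "completed"]

    attempted = [c for c in changes if c != "withdrawn"]
    success_rate = attempted.count("completed") / len(attempted)
    print(f"Change success rate: {success_rate:.0%}")   # 75%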

Items not to do:

Do not authorize changes without rollback plans that everybody reviews. Changes do fail, so be proactive and think ahead about how to recover from a problem rather than attempting to do so during the heat of firefighting.
Do not allow rubber stamping approval of changes.
Do not let any system changes off the hook - someone made it, so understand what caused it.
Do not send mixed messages. Bear in mind that the first time the process is circumvented, incredible damage can be done to the process. "Well heck, we did it last time" or "the boss said to just do it" - both send the wrong message.
Do not expect to be doing closed loop Change Management from the start. Awareness is better than being oblivious, and managed is better than unmanaged. Start with a particular class of changes and constantly refine the process.
The Spectrum of Change:
The management of change is an evolutionary process. Groups should not become discouraged as they start developing their change management processes. The solutions may require changing people, processes, and technology. The following illustrates the stages of change management:
Oblivious to change - Hey, did the switch just reboot?
Aware of change - Hey, who just rebooted the switch?
Announcing change - Hey, I'm rebooting the switch. Let me know if that will cause a problem.
Authorizing change - Hey, I need to reboot the switch. Who needs to authorize this?
Scheduling change - When is the next maintenance window? I'd like to reboot the switch then.
Verifying change - Looking at the fault manager logs, I can see that the switch rebooted as scheduled.
Managing change - Let's schedule the switch reboot for week 45 so we can do the maintenance upgrade and reboot at the same time.
The Change Management goal is to reduce the amount of time spent on unplanned work by reducing the number of self-inflicted problems and by modifying how problems are solved so that change is ruled out early in the repair cycle. By increasing the change success rate and reducing MTTR (mean time to repair), you not only decrease the amount of unplanned work, but also increase the number of changes that the organization can successfully implement.

Friday, April 26, 2013

Key Differences Between ITIL v2 and v3

By now you must have read numerous articles explaining that ITIL v3 is really just an extension of the previous version and that the underlying principles and processes have not really changed, but rather have been refined in places.
The same articles may have also stated that a primary rationale behind the refresh was that ITIL v2 was heavily process-focused. In contrast, ITIL v3 is centered on a service lifecycle approach to help IT departments focus on providing business value. However, if you are like me, you may have finished reading those articles and still asked yourself, “What are the key differences between ITIL v2 and v3?” And, even more important, “How does the new version affect my ITIL implementation? Do I need to switch over to v3? How quickly?”
The simple answer is: Keep doing what you’re doing. If your organization is in the middle of an ITIL v2 implementation, you do not need to change. The expanded elements of ITIL v3 are, in many cases, best-practice activities your organization is already following even though they were not explicitly described in ITIL v2. However, if you have not yet started your ITIL journey, there is no reason not to start with the latest version. Finally, organizations that have already completed their ITIL v2 implementation will find it useful to take advantage of the new version as they proceed with ongoing improvements to their IT service management approach.
That being said, for those interested in better understanding the differences between ITIL v2 and ITIL v3, below I’ve provided a detailed comparison.
Topics Realignment
The most obvious change is the format of the library itself. The ITIL v2 library was presented in seven core books: Service Support, Service Delivery, ICT Infrastructure Management, Planning to Implement Service Management, Application Management, The Business Perspective and Security Management. Most IT professionals focused on the first two books—which are sometimes referred to by their cover colors, as “the blue book” (Service Support) and “the red book” (Service Delivery).
The blue book deals with best-practice processes for day-to-day activities while the red book deals with best-practice processes for forward-looking activities. They offer guidance as to how organizations can improve their processes to work smarter, but do not particularly align the processes discussed with larger business requirements. The other five books touch rather lightly on a variety of ITIL process issues, and are considered somewhat esoteric even by ITIL experts.
In contrast, ITIL v3 has been organized into five new books: Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement. These books follow a more practical order, tracing a service’s lifecycle from initial strategy through design, transition into production, day-to-day operation, and continual improvement.
Expansion of Process Descriptions
In ITIL v3, the key concepts of the Service Support and Service Delivery processes outlined in ITIL v2 have been preserved. They have, however, been augmented with 12 new processes. This can best be seen by looking at all 22 processes combined in the new structure. (Note: Processes covered in the ITIL v2 “blue book” (Service Support) are labeled (B), and processes discussed in the ITIL v2 “red book” (Service Delivery) are labeled (R).)
Service Strategy (Book 1)
Financial Management – No material changes from v2.
Demand Management – ITIL v2 discussed concepts of Demand Management within the context of Capacity Management. However, ITIL v3 introduces Demand Management as a distinct process and a strategic component of service management.
Service Portfolio Management – ITIL v2 only discussed Service Level Management. ITIL v3 represents a fundamental rethinking of services, recognizing the need to conceptualize and identify a portfolio of services before dealing with the specifics of levels of service.
Service Design (Book 2)
Service Level Management – No material changes from ITIL v2 in the Service Design book. Also covered in Continual Service Improvement (Book 5).
Availability Management, Capacity Management and IT Service Continuity Management – No material changes from v2.
Service Catalog Management - A new process that consolidates and formalizes the processes around ensuring that a service catalog is produced and maintained, and that it contains accurate information on all operational services and on those being prepared to be run operationally. In addition, V3 identifies the need for two interdependent parts of the service catalog, namely an “external” business catalog of (business) services as described and recognized by end-users, and an “internal” technical catalog of the tools, processes and procedures required to support those services.
In ITIL v2, the concept of a service catalog was mentioned, but no process was outlined for its creation or maintenance, nor was the distinction made between a business catalog and a technical catalog.
Supplier Management - A process for ensuring that all contracts and suppliers support the needs of the business and that they meet their contractual agreements. Supplier management was covered in ICT Infrastructure Management in ITIL v2.
Information Security Management - A comprehensive process designed to address the strategic direction for security activities and to ensure that objectives are achieved. It includes the confidentiality, integrity and availability of an organization's assets, information, data and IT services. Information security management was covered in very limited form in its own book in ITIL v2. ITIL v3 brings this topic up-to-date with current information security concerns and brings it into better alignment with related issues facing IT organizations.
Service Transition (Book 3)

Transition Planning and Support is a unified process for coordinating all the activities of Change Management, Configuration Management and Release Management described in ITIL v2. It has now been expanded and is presented alongside the related topics of Service Validation and Testing (i.e. testing a new change), Evaluation (managing risks associated with a new change) and Knowledge Management (gathering, analyzing, storing and sharing knowledge and information).
All seven process descriptions have been expanded. In ITIL v2, Knowledge Management was discussed separately, in the Application Management book.
Service Operation (Book 4)
Incident Management and Problem Management – No material changes from ITIL v2.
Event Management is a stand-alone process for detecting and managing events and alarms (which ITIL calls “exceptions”). In ITIL v2, Event Management was covered under Incident Management.
Request Fulfillment is a new process for managing the lifecycle of all customer- and user-generated service requests. These types of requests include facilities, moves and supplies as well as those specific to IT services. In the previous version of ITIL, this process was covered under Incident Management. A notable difference in ITIL v3 is it now recognizes the existence of service requests beyond merely “break/fix” restoration of service.
Access Management is a new process that provides rights and identity management related to granting authorized users the right to use a service, while restricting access to non-authorized users. In ITIL v2, Access Management was covered in the Security Management Book.
Continual Service Improvement (Book 5)
The ITIL v2 red book described a Service Improvement Program (SIP) within the context of Service Level Management, discussing best practices for measuring, monitoring and reporting on services, which ultimately provided the data for making improvements. ITIL v3 expands this into its own book, Continual Service Improvement, and structures a seven step improvement process as follows:
1. Define what you should measure;
2. Define what you can measure;
3. Gather the data;
4. Process the data;
5. Analyze the data;
6. Present and use the data; and
7. Implement corrective action.
Function Comparison
ITIL v2’s Service Support book identified the service desk as the lone function (a group of people and the tools they use to carry out one or more processes or activities). ITIL v3 now identifies three other functions in addition to the service desk: Technical Management, IT Operations Management, and Application Management.
The Technical Management function provides the detailed technical skills and resources needed to support IT services and the ongoing operation of the IT infrastructure. The IT Operations Management function is responsible for the daily operational activities needed to manage IT services and the IT infrastructure. It includes IT Operations Control and Facilities Management. The Application Management function manages applications throughout their lifecycle.
Summary
The key changes include: A consolidation of the library into five books; the addition of 12 new processes and three new functions; emphasis on service management through the entire service lifecycle; and emphasis on creating business value versus simply improving the execution of processes.
Supplemental material will be released over time. Whether now is the right time to introduce the new version into your organization or continue with the old depends on the current state of the organization. That said, since most of the changes to ITIL v3 clarify and augment the previous library, a good case can be made to start using the new and refreshed library.

Monday, March 18, 2013

The Eight Essential Elements of an IT Service Lifecycle

Many companies are attempting to run their IT organizations like a business. They are viewing internal business units as “customers” who require maximum value from their IT investments, rapid response to their needs, consistent service levels, and full visibility into technology costs. In particular, senior executives are demanding this cost visibility to exercise appropriate financial controls and more accurately benchmark service costs.
To adopt this strategy, organizations must understand the business—including visibility into the defined or documented service offerings and business processes the IT organization delivers to the organization.
It is here the IT service lifecycle (ITSL) can help. It can serve as a framework to help define, publish, and improve service offerings by redefining an IT service in the context of a dynamic business environment. The ITSL includes eight essential elements:
1. Definition- Service definition is the most important element of the ITSL. The definition of a service begins by documenting the intended business goals, policies, and procedures. These include the desired functional requirements, such as the need to produce a service report or generate an invoice. Non-functional requirements, such as service availability, performance, accuracy, and security, must also be considered. Services need to be described in a way that is meaningful to the user. For example, “300 GB Ultra SCSI hard drive in a RAID 5 array” could easily be described as “secure data storage” to non-technical staff.
2. Publication- Once defined, the definition is published in a service catalog. The goal of the service catalog is to create a vehicle that enables users to proactively select the IT services that best suit their needs.
Many companies publish several service catalogs created by various internal groups. This can be confusing to users who need to be aware of multiple locations to make requests. The trend is to create and publish a single common user interface containing a central repository of all IT service offerings, regardless of the internal or external service provider.
3. Request Model- A service provider interacts with end users or business units through the end-user request model or the subscription model.
The end-user request model enables the user to select services from a published catalog. The goal is to automate those human steps necessary to deliver the service to the end user, including required and often time-consuming “hand offs” such as multi-departmental reviews and approvals.
The subscription model automatically delivers a standard set of services according to a pre-arranged service level agreement (SLA). For example, when creating an email account for a new employee, a specific request is made to subscribe to this service on behalf of the employee.
These two interaction mechanisms can be employed separately or together. For example, a user-facing catalog may contain a service called “add new hire", which in turn enables the delivery of a continuous email subscription.
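One way to picture the two mechanisms working together, as in the “add new hire” example (a sketch; the service names and functions are hypothetical):

    # Sketch of the two request models. An ad-hoc catalog request ("add new hire")
    # can in turn trigger an ongoing subscription (email for that employee).

    def fulfill_request(service, user):
        print(f"one-time request: provision '{service}' for {user}")
        if service == "add new hire":
            subscribe("email account", user)   # the request kicks off a subscription

    def subscribe(service, user):
        print(f"subscription: deliver '{service}' to {user} under the standard SLA")

    fulfill_request("add new hire", "a.sharma")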
4. Provisioning- IT service provisioning enables the automated delivery of services selected from a catalog, such as setting up an email account, software installation, or providing access to a specific application.
Business processes must first be discovered and appropriate integration touch-points documented. This helps organizations identify those limitations in technology and process that will typically require a phased approach for successful implementation of an automated solution.
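As a sketch, a phased implementation might model each step as either automated or a documented manual hand-off, automating more steps over time. The step names here are invented for illustration:

# Each step is (description, automated?). Manual steps are the documented
# integration touch-points still awaiting automation in a phased rollout.
PROVISIONING_STEPS = [
    ("manager approval", False),        # multi-departmental review / hand-off
    ("create account", True),
    ("install software", True),
    ("grant application access", True),
    ("security sign-off", False),
]

def provision(service, user):
    for step, automated in PROVISIONING_STEPS:
        mode = "auto" if automated else "manual hand-off"
        print(f"[{service}/{user}] {step} ({mode})")

provision("email account", "jdoe")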
5. Measurement- Organizations often include measurement as part of their ITSL requirements. This enables them to meter service consumption for billing, monitor automated service levels, and deliver detailed service usage reports.
Even with sophisticated instrumentation methods, it is often difficult to allocate costs to an individual client or to correlate usage data with a single business unit. For example, wireless networks complicate the issue of cost allocation for network bandwidth usage, and shared services models mean that multiple clients can consume a specific resource simultaneously.
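One common workaround for the shared-resource problem is to split metered consumption across business units in proportion to their measured share. A minimal sketch, with made-up numbers:

def allocate_shared_usage(total_units, measured_share):
    """Split consumption of a shared resource (e.g. network bandwidth)
    across business units in proportion to their measured share."""
    grand_total = sum(measured_share.values())
    return {unit: total_units * share / grand_total
            for unit, share in measured_share.items()}

# 10,000 GB of shared wireless bandwidth, attributed by sampled per-unit traffic
print(allocate_shared_usage(10_000, {"Sales": 450, "Engineering": 300, "Finance": 250}))
# {'Sales': 4500.0, 'Engineering': 3000.0, 'Finance': 2500.0}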
6. Cost Recovery Process- Cost recovery or chargeback takes many forms. If an organization employs a usage-based methodology, appropriate instrumentation is vital to achieve an accurate calculation. Another approach is to utilize standardized accounting methodologies such as high-level allocation or low-level allocation to distribute direct and indirect service costs.
Regardless of the cost recovery process, integration with an external financial system or ERP solution is often required. These systems generate journal entries for recovery costs to facilitate the billing process.
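Here is a sketch contrasting the two approaches, emitting journal-entry-like records that a downstream financial or ERP system could consume; the rates and weights are made up:

def usage_chargeback(rate_per_unit, usage):
    """Usage-based: bill each business unit for what it actually consumed."""
    return [{"unit": u, "amount": round(rate_per_unit * qty, 2)}
            for u, qty in usage.items()]

def high_level_allocation(total_cost, weights):
    """High-level allocation: distribute a pooled cost by agreed weights
    (e.g. headcount) when per-unit instrumentation is impractical."""
    total_weight = sum(weights.values())
    return [{"unit": u, "amount": round(total_cost * w / total_weight, 2)}
            for u, w in weights.items()]

journal = usage_chargeback(0.02, {"Sales": 4500, "Engineering": 3000})
journal += high_level_allocation(12_000.00, {"Sales": 120, "Engineering": 80})
print(journal)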
7. Assessment- Assessment is a vital element of the ITSL. As business requirements change, any number of these elements may have to be modified to realign IT goals accordingly.
8. Ongoing Process Improvement- Business processes and service definitions are both abstract and dynamic. In the ITSL, each service is an opportunity for continuous process improvement, not a static deliverable.
Value-Added, Strategic Business Partner
Using an iterative approach of assessment and improvement will yield the ultimate goal of managing the ITSL from a business perspective. This is the Holy Grail of efficiency and effectiveness that enables IT to run like a business.
A business-centric approach toward IT can yield substantial cost savings and significantly improve the alignment of IT spending with business need. By achieving accountability with these eight elements, you can help transform your IT organization from a reactive, cost-focused department to a responsive, value-added, strategic business partner.

Thursday, February 14, 2013

What “Change Management” Brings to Any Organization

Today’s organizations are faced with unending competition, changing circumstances and increased customer demands. To remain viable and competitive, their IT and services organizations must be in complete alignment with the strategic goals of the organization.
In businesses, this means that IT has to be a partner in delivering value to the customer. One of the important challenges of doing this is to ensure that changes are implemented without disrupting the delivery of that value to the customer. ITIL Change Management (CM) provides the necessary framework to guide organizations in meeting this challenge.
While no methodology can guarantee absolute success, a standardized Change Management process with clearly defined roles and responsibilities increases the likelihood that business objectives and goals are successfully achieved, and limits the likelihood of embarrassing and costly mistakes in implementation.
ITIL Change Management is one of the most effective ways to provide stability to the IT organization. It is the linchpin of any ITIL implementation because it both controls many of the ITIL processes and ensures that the other processes are not attempting to work within an unstable environment.
After a great deal of searching and reading, I have compiled a list of Top 10 Tips to use when implementing an ITIL Change Management process.
1. Clarify what Change Management will accomplish in your organization. Many corporations struggle with defining ITIL in general and Change Management in particular. The most common misunderstanding is the assumption that implementing Change Management will fix issues that are related to Release Management or Configuration Management.
Change Management focuses on the oversight and approval aspects of the process, ensuring that only authorized changes are being worked on. It is more related to organizational change than to the operational aspects of change.
2. Articulate the benefits of Change Management to each level of the organization. A top-down organizational approach is usually the most effective way to establish Change Management. The more the leaders of an organization demonstrate commitment and participation in implementing a Change Management program, the better the chance of success.
Getting buy-in at all levels is critical to the success of the program. The first step to achieving a successful buy-in is identifying all stakeholder groups that are affected by such an implementation.
Stakeholders need to understand the benefits on a personal and an organizational level (What’s in it for me?). It is essential to define and present to each stakeholder what those benefits will be and, conversely, to establish and enforce policies that address the penalties and repercussions for bypassing the process. Finally, to ensure buy-in and understanding, communicate the same message to everyone involved about what those policies will cover.
3. Define what a Change is. The most important concept to convey is that everything in the IT world can have a change element to it. Nothing should fly under the radar.
All Installs, Moves, Adds, and Changes (IMACs) to the infrastructure, and any software changes, should fall under the control of Change Management. Even the most seemingly innocuous changes can cause major disruptions if no one knows about them. This is especially true if you are implementing Change Management in an immature, silo-structured organization.
4. Establish clear roles and responsibilities for the Change Advisory Board (CAB), Change Manager, and Change Authority. Creating an Executive Committee for the CAB is a good way to keep the executives engaged in the process without subjecting them to the low level details that change management sometimes involves. Having executive sponsorship increases your leverage when encountering parties resistant to changing the status quo.
An effective and successful Change Manager is one who proactively ensures that the right resources, both technical and business, attend the CAB and present viable, justifiable changes. The Change Manager can be the final arbiter in resolving disputes over classifications and prioritizations. Ensure that the Change Authorities who are representing changes to the CAB are well-informed and can speak to their items when challenged. Their role is to present the business case, the impact analysis, the resource plan and execution plan for each change.
The CAB is not just an IT operation. A successful CAB will have a wide rotating mix of participants from the IT, Operations and Business groups. Embrace the flexibility that the CAB offers by limiting the standing participants and ensuring only those resources that can add value to the discussions are invited to the meetings.
5. Create standardized processes and timeframes to support Change Management. Defining processes and timeframes up front around Major, Minor and Emergency changes will assist in managing client expectations as to when, and how, changes will be delivered.
Getting the senior members of the Change Advisory Board (CAB) to sign off on the criteria is essential to reducing the noise level. Define the boundaries around priorities and hold to them. By having standard change processes understood, there will be fewer circumventions of the system.
If your Change Management scope is significant and incorporates IMACs as suggested above, there needs to be a means of expediting the process to overcome the bureaucracy that may otherwise prevail. Designing and implementing a standard change model for changes whose risk level is already well understood allows for such expediency.
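A rough sketch of that routing logic; the categories and the example changes are illustrative only:

# Pre-approved standard changes: the risk is well understood, so they
# bypass the full CAB review instead of waiting for the next meeting.
STANDARD_CHANGES = {"password reset", "desktop move", "memory upgrade"}

def route_change(description, emergency=False):
    if description in STANDARD_CHANGES:
        return "auto-approved (standard change model)"
    if emergency:
        return "expedited emergency CAB review"
    return "scheduled for next CAB meeting"

print(route_change("desktop move"))                    # auto-approved
print(route_change("database migration"))              # full CAB review
print(route_change("failover fix", emergency=True))    # emergency path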
6. Establish and Stabilize the Change Management Process before introducing tools. In theory, it seems logical to buy a tool that can guide your change management implementation and utilize it as a key component of your change program. In practice, this approach is rarely effective. Introducing new processes, making them more efficient and finalizing them will lay the groundwork for defining the requirements for a tool selection. You can then better evaluate a tool fit for your purposes instead of getting lost in the various options that most tools present.
7. Define Key Performance Indicators (KPI) and Critical Success Factors (CSF) that highlight the improvements that Change Management brings to the organization. Bring metrics to Senior Management’s attention on a regular basis showing how CM is benefiting the organization.
Sample CSFs should reflect that CM is:
A repeatable process that can make changes quickly and accurately
Protecting the integrity of the service when making those changes
Delivering process efficiency and effectiveness
Sample KPIs should be established around the following (a minimal sketch of computing them appears after the list):
Reduction of unauthorized changes
Reduction in change related outages
Reduction in emergency changes
Actual cost of a change vs. planned (budgeted) cost
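As promised above, here is a minimal sketch of computing these KPIs from a set of change records. The record fields (authorized, caused_outage, and so on) are hypothetical names of my own choosing:

def change_kpis(changes):
    """Compute the sample KPIs from change records, each a dict with
    fields: authorized, caused_outage, emergency, actual_cost, planned_cost."""
    n = len(changes)
    return {
        "unauthorized_pct": 100 * sum(not c["authorized"] for c in changes) / n,
        "outage_pct": 100 * sum(c["caused_outage"] for c in changes) / n,
        "emergency_pct": 100 * sum(c["emergency"] for c in changes) / n,
        "cost_variance": sum(c["actual_cost"] - c["planned_cost"] for c in changes),
    }

sample = [
    {"authorized": True, "caused_outage": False, "emergency": False,
     "actual_cost": 900.0, "planned_cost": 1000.0},
    {"authorized": False, "caused_outage": True, "emergency": True,
     "actual_cost": 2500.0, "planned_cost": 1500.0},
]
print(change_kpis(sample))

Tracking these figures period over period is what demonstrates the reductions listed above to Senior Management.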

8. Ensure back-out plans are documented and realistic. Although no one ever intentionally introduces defects into the production environment, it is a fact of life that problems will sometimes arise as a result of new submissions. To combat these instances, there must be a robust contingency plan in place to minimize the amount and length of production outages. Ensuring that the Release Management team comes prepared to the CAB with both their implementation plan and back-out strategy is an essential check-point for the Change Manager.
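This is the kind of check a Change Manager might automate before a CAB meeting, sketched here with hypothetical field names:

def ready_for_cab(change):
    """Gate a submission: both an implementation plan and a tested
    back-out plan must be attached before it reaches the CAB."""
    return (bool(change.get("implementation_plan"))
            and bool(change.get("backout_plan"))
            and change.get("backout_tested", False))

release = {"implementation_plan": "deploy v2.1", "backout_plan": "redeploy v2.0"}
print(ready_for_cab(release))   # False: the back-out plan has not been tested
release["backout_tested"] = True
print(ready_for_cab(release))   # True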
9. Accentuate the positive by building on successes and leveraging lessons learned. Discussing lessons learned, whether good or bad, is important for everyone involved to better prepare for the next instance. While it is important to correct bad behaviors after a release, it is just as important to highlight what went well. Showcase success stories and integrate lessons learned into plans for further roll-outs.
10. Use the Change Management Initiative to promote other ITIL processes. Many organizations are only familiar with the Change Management component of ITIL.
· Use the success story from implementing Change Management to promote the benefits of the other processes and how they will improve the overall performance of IT. Change Management cannot be truly effective in isolation.
· When Release and Configuration Management processes are absent, consider combining all three into a centralized function. The three processes have many close links to each other and together can stabilize an organization’s production environment.
In summary, implementing Change Management is and should be viewed as a major strategic undertaking. It is much more than a simple process roll-out. As a starting point, organizations need to know where they stand in terms of ITIL maturity, where gaps exist, and where they want to be.
Any ITIL implementation is a major change program that warrants a roadmap, a realistic project plan, and associated communications to achieve the desired outcomes. It also requires training the support organization, as well as the users receiving the service, on the new processes and procedures. Piloting the new processes or performing dry runs will further ensure a smooth transition and greater effectiveness.
Using items such as these Top 10 Tips to aid understanding throughout the support organization is a good starting point for design and deployment of an ITIL-based Change Management Process. In the end, implementing change management is a key step in establishing the stability you need to make your other ITIL processes and your organization more successful.

Thursday, January 10, 2013

Use ITIL to Enhance Your Disaster Recovery Capability

You spent time developing an actionable plan, assigned responsibility to various personnel, and created confidence that business systems will be available during times of disaster. But what happens on Day 2, after the plan is put in place? The plan must be maintained, tested, and revised to stay aligned with business changes and increasing risk. Maintaining the plan becomes one of the most critical initiatives for reducing future risk.
One method to ensure your plan is kept current is to integrate your BC/DR efforts with the best practices contained in the IT Infrastructure Library (ITIL). Various processes within ITIL integrate seamlessly with BC/DR and, when implemented properly, can enable the ongoing maintenance of your BC/DR plan. Processes such as Service Level, Incident, Change and Configuration Management stipulate activities that can help ensure your plan stays current, regardless of the dynamic nature of the business.
The service lifecycle contained in ITIL v3 can provide for the ongoing improvement of your BC/DR plan over time. Let's take a look at some of the activities contained within ITIL that you can use to enhance your BC/DR capability:
Service Level Management (SLM) – Activities contained in this process establish the guidelines to design and implement services within your organization to ensure that the business and IT are aligned. Whether you are adding new services or maintaining existing services, continuity is a key component that must be addressed.
When determining your strategy for delivering services, consider its effect on your BC/DR plan. Including continuity discussions when defining services and drafting service level agreements ensures that the business understands how the service will be recovered during disasters and other associated risks. Your service level manager should have a detailed understanding of the contents of your BC/DR plan, so that changes to business services that impact the plan are taken into consideration when the service catalog is revised.
Incident Management (IM) – In its simplest form, an incident is any event that results in a service becoming unavailable or degraded. Disasters are major incidents that require the organization to follow an established workflow to restore the service to an acceptable and agreed-upon service level. The process to detect, record, diagnose, resolve, and close these incidents must be established in order to manage them effectively.
Integrating the contents of the BC/DR plan into your incident management process will ensure that the activities of the plan are executed in a timely and organized manner. Establishing the process that will be followed when these incidents occur will ensure that all the stakeholders involved will be notified of their responsibilities.
Service Desk – Various service desk technologies available today can help the process work more effectively. The service desk is where you can create the templates used to document an incident and establish the workflow that will be followed.
The BC/DR plan will stipulate the activities to be followed during a disaster and the service desk will be the place where the workflow will be documented. The standard template can be initiated, allowing each person with responsibility for the incident to be notified so they can take the appropriate actions established in the plan.
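As a sketch, such a template might look like the following; the severity value, workflow steps, and owners are placeholders, since the real ones come from your BC/DR plan:

# A disaster is a major incident: instantiating the template notifies each
# responsible party named in the BC/DR plan of their assigned actions.
DR_TEMPLATE = {
    "severity": "major",
    "workflow": [
        ("detect and record", "service desk"),
        ("declare disaster", "incident manager"),
        ("fail over to DR site", "infrastructure team"),
        ("verify agreed service levels", "service level manager"),
        ("close incident", "incident manager"),
    ],
}

def open_dr_incident(service):
    print(f"Incident opened for '{service}' (severity: {DR_TEMPLATE['severity']})")
    for step, owner in DR_TEMPLATE["workflow"]:
        print(f"  notify {owner}: {step}")

open_dr_incident("order processing")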
Configuration Management Database (CMDB) – All BC/DR plans contain a detailed list of the configuration items making up the critical services that need to be recovered during a disaster. Inclusion of the configurations in the CMDB will ensure that the information is available to all the required parties during the disaster. The CMDB will also contain all the attributes of the configuration items and establish ownership, control mechanisms, status, and verification requirements that will be needed to maintain the information so it stays current.
The CMDB is an important component for Incident and Change Management in order to determine the appropriate scope of outages and changes. Any changes to the configurations of critical services will have a direct impact on the BC/DR plan and should be annotated as such in the CMDB.
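A sketch of what that annotation might look like: configuration items carry an attribute marking BC/DR relevance, and a proposed change can be checked against it. The attribute names are mine, not from any particular CMDB product:

from dataclasses import dataclass

@dataclass
class ConfigurationItem:
    name: str
    owner: str
    status: str            # e.g. "live", "retired"
    bcdr_critical: bool    # part of a service named in the BC/DR plan?

cmdb = [
    ConfigurationItem("mail-server-01", "messaging team", "live", True),
    ConfigurationItem("test-db-03", "qa team", "live", False),
]

def bcdr_impact(changed_ci_names):
    """Return the changed CIs whose modification warrants a BC/DR plan review."""
    return [ci for ci in cmdb if ci.name in changed_ci_names and ci.bcdr_critical]

print(bcdr_impact({"mail-server-01", "test-db-03"}))  # only mail-server-01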
Availability and Capacity Management – As the service catalog is developed, service level requirements will determine the level of availability and capacity needed to reduce risks associated with the infrastructure. These plans provide critical input to the BC/DR plan in that they indicate the level of redundancy and capacity requirements needed to ensure continuous service operations. Availability and capacity managers must be consulted when developing the BC/DR plan so the level of capability can be defined appropriately. A detailed process in this area will lead to a more effective BC/DR plan.
Continual Service Improvement – In order to ensure that IT transforms itself toward a higher level of maturity, service improvement plans must be developed and maintained. A key component of drafting an effective plan is identification of the impact improvements (changes) have on the BC/DR plan. A detailed process for coordinating and determining the effect on the BC/DR capability is critical in ensuring that when improvements to the plan are made, all stakeholders are informed and the impact on the business is considered.
Summary-
Integrating your ITIL program with your BC/DR capability can provide the business with the assurance that IT is considering all aspects of maintaining continuous service operations. The list of correlating processes between ITIL and BC/DR contained in this article is not all-inclusive; however, it represents a good start toward transforming and aligning IT with the business.
When developing your BC/DR capability, it is a good practice to consider ITIL as a means to enhance your plan by taking the time to integrate your efforts and consider the effects of Services, People, Process and Tools. The time you spend to ensure effective maintenance of your BC/DR plan using the ITIL best practices will go a long way to ensure continuous business operations before, during and after a disaster.