Infrastructure Performance Management and the Cloud

By Chris James, Marketing Director EMEA at Virtual Instruments (www.virtualinstruments.com)

Application performance is a major concern for any large enterprise considering virtualisation, never mind cloud adoption, and most IT managers are reluctant to adopt the cloud for mission-critical applications and data. This is particularly true in the financial services sector, where large volumes of confidential data must be accessible 24 hours a day, 365 days a year.

While the adoption of leading-edge IT systems and infrastructures in this industry is well documented, until recently there was a reluctance to embrace virtualisation and cloud computing because of the complexity they can bring to infrastructure performance management, and especially to data and application migration, which are both regular and unwelcome occurrences in larger datacentres. IT administrators in this sector often have to balance several factors simultaneously, including mergers and acquisitions, new technology roll-outs and stringent regulatory compliance.

While a large bank, for example, may be comfortable moving parts of its datacentre (e.g. those supporting human resources and marketing) to a virtualised environment, it is likely to take much longer to decide how to manage its business-critical applications. It is perhaps not surprising that when enterprises started to migrate applications to a private cloud computing model, they went down the over-provisioning route to reduce the risks associated with migration and consolidation projects. However, as its name suggests, over-provisioning means buying more capacity than is needed. Doing so across a large IT infrastructure is expensive and wasteful, and it does not guarantee that a migration will be successful. A recent survey of 200 IT decision makers in the UK and Germany, conducted by independent research agency Vanson Bourne for Virtual Instruments, found that approximately 48% of respondents in the financial services market prefer to benchmark their data migrations to help ensure a smooth transition.

Until recently, benchmarking has entailed measuring the capacity, utilisation and management of individual physical components. With the move to virtualisation and new deployment models, Application Performance Management (APM) and Network Performance Management (NPM) tools have been used to try to fill in for what is missing. The challenge is that these tools do not give a holistic view of the underlying IT infrastructure. So, across all of the tool sets that address device, application and network performance, there is still a huge gap in understanding system-wide performance. This has led to the development of Infrastructure Performance Management (IPM) for assuring performance optimisation, risk mitigation and sustainable service level agreements (see diagram below).

[Diagram: ipm-cloud]

Some 30% of the Fortune Top 100 companies have found that, to reduce the risks and successfully benchmark a migration, innovative technologies that give full transparency across the IT infrastructure, such as an Infrastructure Performance Management (IPM) platform, are essential.

IPM can reduce the risk and cost that come with migrating applications to a private cloud and can help ensure critical application performance throughout the process. By using such technology, application performance can be base-lined before the migration starts and monitored during the move, end-to-end and in real time.
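As a rough illustration of base-lining, the Python sketch below derives a baseline from pre-migration latency samples and flags live samples that drift too far above it. It assumes the latency figures (in milliseconds) have already been collected by whatever monitoring platform is in use. The function names, the choice of the 95th percentile and the 20% tolerance are illustrative assumptions, not features of any particular IPM product.

```python
from statistics import quantiles


def build_baseline(latencies_ms):
    """Return the 95th-percentile latency of the pre-migration samples."""
    # quantiles(..., n=20) yields 19 cut points; the last is the 95th percentile.
    return quantiles(latencies_ms, n=20)[-1]


def breaches(baseline_ms, live_samples_ms, tolerance=0.20):
    """Return the live samples that exceed the baseline by more than `tolerance`."""
    limit = baseline_ms * (1 + tolerance)
    return [s for s in live_samples_ms if s > limit]
```

In practice the baseline would be built from days or weeks of samples covering normal business cycles, and the comparison would run continuously while the migration is under way.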

According to industry analyst reports, the market for solutions that address IPM will be worth $9 billion globally by 2015. When financial organisations migrate business-critical applications such as Exchange or SAP to a virtualised environment, there are several best practices to observe, including:

Before the migration starts, find and eliminate connectivity errors. This means cleaning up multi-pathing errors, both in terms of single and unbalanced I/O paths. If you cannot prove that data paths are completely redundant and balanced, they probably aren’t. At the same time, monitor for physical layer issues in the infrastructure, from the VMs to the storage LUNs. Waiting to find and fix physical layer issues after the migration unnecessarily increases risk.

Next, ensure optimal performance by right-sizing the infrastructure components and configuring them properly. Because vendors like VMware report that nearly 90% of performance-related problems are due to issues with the storage and its network configuration, spend a proportionate amount of effort there, not just on server-side optimisation.

If you optimise utilisation, you will reduce I/O congestion. Good network capacity planning can help maintain networks in optimal working order. It can reduce the risk of outages due to resource limitations and help justify future networking needs. It is important to look for patterns that occur at various times of day: there is often the equivalent of a “rush hour”, when I/O traffic slows because of significantly increased demand. To optimise storage capacity, use solutions that can find and report on all underutilised and unused LUNs and switch ports.

Finally, using scenario or “what-if” reporting, you can simulate the effect of a configuration change on throughput or application latency. For instance, one of our customers uncovered a backup job that would have created a bottleneck if the private cloud migration had taken place exactly as planned.
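Two of the practices above, checking that I/O paths are redundant and balanced and running a what-if scenario, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the per-path I/O counts and hourly throughput figures are assumed to come from existing monitoring data, and the function names, the 25% skew threshold and the data layout are hypothetical rather than taken from any real tool.

```python
def check_paths(path_io, max_skew=0.25):
    """Classify a LUN's multipathing from per-path I/O counts.

    path_io maps a (hypothetical) path name to the I/O count seen on it.
    """
    # Assumption: a path carrying no traffic is treated as not in use.
    active = {p: n for p, n in path_io.items() if n > 0}
    if len(active) < 2:
        return "single-path"  # no effective redundancy
    fair_share = sum(active.values()) / len(active)
    # Largest relative deviation of any path from an even split.
    worst = max(abs(n - fair_share) / fair_share for n in active.values())
    return "unbalanced" if worst > max_skew else "balanced"


def what_if(hourly_mbps, extra_job, capacity_mbps):
    """Return the hours in which adding `extra_job` would exceed capacity.

    hourly_mbps: average throughput per hour of day (index = hour).
    extra_job: dict mapping hour -> additional MB/s the new job would add.
    """
    return [hour for hour, load in enumerate(hourly_mbps)
            if load + extra_job.get(hour, 0) > capacity_mbps]
```

A backup job scheduled into an hour that is already busy shows up as a non-empty result from `what_if`, which is the kind of bottleneck worth catching before the migration rather than after it.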

Though private clouds can help speed deployments and reduce costs, there is little advantage to the end-user if they increase the risk to application performance and availability. By de-risking these areas, you can deliver the full benefits of the new compute model.

Infrastructure Performance Management lets the IT department respond proactively to the requirements of the business, rather than reacting to issues as they arise. Latency in applications is rarely caused by one issue; it is usually a combination of small problems that compound to create a big problem. By being able to see the entire infrastructure in real time, from the virtual machine right down to LUN level within the storage device and everything in between, the IT department can resolve these small problems before they affect application performance. This approach cannot be taken using the tools provided with the server, switch and storage devices, as they each give only part of the story. If a problem occurs, there is a lot of finger pointing as to who is responsible, which wastes valuable time in resolving the issue. IPM lets you see where the issue is and helps you resolve it quickly. IPM, together with TAPing (inserting a Traffic Access Point into the system so that the Fibre Channel protocol can be read offline), is quickly becoming the de facto best practice for financial services companies wishing to guarantee application performance to the business while simultaneously growing the environment and reducing cost.

 

 

 

 
