MAKING SURE SYSTEMS STAND UP IN TOUGH TIMES

Richard Muirhead, CEO, Automic

Within the financial services sphere, the innovation race is on. Customers are increasingly accustomed to having technology at their fingertips, so institutions must be more nimble than ever in embracing new technologies. Innovating and keeping ahead of the competition (both other institutions and new, non-traditional digital competitors) while meeting growing security, regulatory and compliance requirements is critical, and it presents a huge challenge.

Software architectures in financial services are complex, and are becoming more so with the advent of the digital age. The off-the-shelf and custom applications that financial institutions need in order to exist mean complexity is a fact of life. With the convergence we’ve seen over the last decade and the explosion of connected devices, this complexity is rising steadily. So what’s the problem? It isn’t the complexity itself. Rather, it appears that outages are always related to change in some way.

The frequency of change is increasing, along with the number of components and the number of people involved in any given change. This presents huge challenges for financial institutions as they try to innovate and introduce new functionality quickly, because it increases their risk exposure and can result in failures.

There are some key areas that financial institutions need to address to help guard against the risk of IT failures:

1. Centralised packaging and tracking

Software updates and applications are often developed by globally distributed teams that are disconnected from one another physically and technically. For example, the CRM portal and the point-of-sale application can be developed by different teams, who use different technology stacks and are located on different continents. Centralisation is key, and interdependency between these artifacts should be tested and validated as early as possible, well before an application is rolled out to the operations environment.

Centralised automation for packaging and transporting the correct artifacts can cut out the human error that may occur when deployments are handled manually, and so decrease the risk of failure.
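
As a rough illustration of central packaging and tracking, the sketch below records a SHA-256 checksum for every artifact in a package, so that what arrives in an environment can be checked against what was packaged. The function names and the manifest layout are assumptions made for this example, not a description of any particular product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum (hypothetical layout)."""
    return {
        p.relative_to(package_dir).as_posix(): sha256_of(p)
        for p in sorted(package_dir.rglob("*"))
        if p.is_file()
    }


def verify_package(package_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing, altered, or unexpected."""
    actual = build_manifest(package_dir)
    return sorted(
        name
        for name in manifest.keys() | actual.keys()
        if manifest.get(name) != actual.get(name)
    )
```

An empty list from verify_package means the artifacts on the target match the ones recorded at packaging time; anything else names the files to investigate before rollout.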

2. Standardised processes

Deploying application changes means following specific guidebooks in an exact order. For example, a database must be updated with the new schema before the application server can be activated. As the number of environments, servers and application tiers grows, so does the length of the deployment instructions. For mission-critical applications in particular, the upgrade or deployment procedure can be so complex that administrators have a complete book to follow.

In such cases there often isn’t the time or the resources to deploy many times during the testing cycles. Administrators often try to automate with whatever scripts and configuration tools are available, but the slightest mistake is often fatal. A standardised and well-tested deployment capability is key to avoiding these kinds of issues.
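
The ordering constraint described above (schema first, application server afterwards) can be written down as data rather than as pages of instructions. The following is a minimal sketch using Python's standard graphlib module; the step names are hypothetical and stand in for whatever a real runbook contains.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key lists the steps that must finish before it.
DEPENDENCIES = {
    "update_db_schema": set(),
    "deploy_app_server": {"update_db_schema"},
    "activate_app_server": {"deploy_app_server"},
    "run_smoke_tests": {"activate_app_server"},
}


def deployment_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order that respects every dependency.

    Raises graphlib.CycleError if the steps depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

Because the order is computed from declared dependencies, the same definition can be replayed in every test cycle, and a contradictory set of instructions is rejected before anything is deployed.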

3. Abstracted deployment models

Every application an enterprise uses goes through a lifecycle of development, functional testing, integration and load testing, and finally production rollout. At each stage, the application is put on a different set of servers. The trouble is that each successive environment is often many times larger and more complex than the last in terms of hardware, operating systems and even the application infrastructure stack. It is not uncommon, for example, for an application to be developed with JBoss on Windows at the developer’s station and end up in production running under WebLogic on Linux.

The implementation of an abstracted deployment model is important to ensure that even subtle differences in deployment don’t negatively impact the operation.

4. Snapshot validation stage

Every update to an application creates a new configuration baseline: the versions of its artifacts, plus the values that many deployment processes update continuously.

Connection pool sizes and other configuration settings, for example, may or may not become mission-critical with the next deployment, but leaving them unchecked may not be a risk worth taking. Validating the state of an application (the number of files and the exact version in each directory, as well as the values in key configuration files) can mean the difference between success and outage. Emergency patches are a real weak point here, as they tend to “disappear” with new versions and sometimes cause repeated outages that can take a long time to diagnose.

A baseline snapshot at the beginning and end of each new update can prevent many potentially serious issues from occurring.
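
One simple way to picture snapshot validation is as a comparison of two flat records of versions and configuration values, one taken before the update and one after. The sketch below assumes a snapshot is a plain dictionary of names to values; the function name and the keys in the usage example are hypothetical.

```python
def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, dict]:
    """Compare two snapshots of versions and configuration values."""
    return {
        # Entries that exist only after the update.
        "added": {k: after[k] for k in after.keys() - before.keys()},
        # Entries that existed before and are now gone, such as a lost hotfix.
        "removed": {k: before[k] for k in before.keys() - after.keys()},
        # Entries present in both snapshots whose value differs.
        "changed": {
            k: (before[k], after[k])
            for k in before.keys() & after.keys()
            if before[k] != after[k]
        },
    }


# Hypothetical usage: the emergency patch vanishes with the new version.
before = {"app.version": "2.3.0", "pool.size": "50", "hotfix.jar": "present"}
after = {"app.version": "2.4.0", "pool.size": "20"}
report = diff_snapshots(before, after)
```

Here the "removed" entry flags the emergency patch that disappeared, and the "changed" entry shows the connection pool size silently dropping, both before either can cause an outage.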

5. Ability to automatically roll back in time

Perhaps the best piece of advice for when things go wrong in a production environment is to undo whatever has changed and roll back to the previous version. However, this can be a very challenging task, particularly in a long and complex update process.

Navigating instructions manually can take too long, and in a high-pressure environment this can lead to greater damage. Scripts are often used to perform high-volume changes, yet matching rollback scripts frequently do not exist, so those changes become impossible to reverse. An automated rollback capability can be a “lifesaver” for an institution.
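
The idea of pairing every change with its reversal can be sketched in a few lines. In this illustrative example each deployment step carries both an apply action and an undo action; the Step type and the function name are assumptions made for the sketch, not a reference to any specific tool.

```python
from typing import Callable

# A step is (name, apply action, undo action); all three are hypothetical.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def deploy_with_rollback(steps: list[Step]) -> bool:
    """Apply steps in order; on failure, undo completed steps in reverse."""
    completed: list[Step] = []
    for step in steps:
        _name, apply, _undo = step
        try:
            apply()
        except Exception:
            # Unwind only the steps that finished, most recent first.
            for _done_name, _done_apply, undo in reversed(completed):
                undo()
            return False
        completed.append(step)
    return True
```

The sketch assumes a failed step leaves nothing behind that needs undoing; a real process would also have to handle partially applied steps and failures during the rollback itself.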

As seen, there are a number of factors to consider, but agility in deploying change is the biggest; it is not just about software complexity. Extremely complex financial applications work well provided they are untouched. Often the deployment and release processes are the root of the problems, and these five considerations all point to the importance of a purpose-built, standardised solution that lets an institution adapt to change efficiently.