Performance Testing and Resolution Case Study
The Client
The client is a global shipping company that serves the world’s leading oil and gas companies.
With more than 6,000 employees in offices worldwide and a fleet of more than 150 ships, the client requires consistently reliable software support and maintenance.
The Challenge
Our client had a mission-critical system that managed the accounts payable (AP) workflow in their worldwide offices. Every month, offshore AP clerks scan over 3,000 invoices into the system and route them to the appropriate approvers in other offices. The system was intermittently freezing and crashing, causing a delay of 5 to 15 minutes per incident. With 8 to 12 AP clerks each losing 5 to 15 minutes several times a day, hundreds of working hours were being lost each month.
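The scale of the loss can be estimated with simple arithmetic. The sketch below uses the clerk counts and delay lengths quoted above. The number of incidents per clerk per day and the number of working days per month are not stated in this case study, so the values used here are assumptions for illustration only.

```python
# Rough estimate of working hours lost per month to system failures.
# From the case study: 8 to 12 clerks, 5 to 15 minutes lost per incident.
# ASSUMED (not stated in the case study): incidents per clerk per day
# and working days per month.

def lost_hours_per_month(clerks, minutes_per_incident, incidents_per_day, working_days):
    """Return the total hours lost per month across all clerks."""
    total_minutes = clerks * minutes_per_incident * incidents_per_day * working_days
    return total_minutes / 60

INCIDENTS_PER_DAY = 3  # assumption standing in for "several times a day"
WORKING_DAYS = 21      # assumption: typical working days in a month

low = lost_hours_per_month(8, 5, INCIDENTS_PER_DAY, WORKING_DAYS)
high = lost_hours_per_month(12, 15, INCIDENTS_PER_DAY, WORKING_DAYS)
print(f"Estimated loss: {low:.0f} to {high:.0f} hours per month")
# prints "Estimated loss: 42 to 189 hours per month"
```

Under these assumed values the estimate runs from about 42 to 189 hours per month. A higher incident rate pushes the total up proportionally.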
Our client needed a partner to troubleshoot the issue, identify the cause(s), and recommend and implement a solution. Their system was quite complex; it consisted of several servers, databases, file servers, web services and scheduled activities.
Key Challenges
– Optimus did not develop or implement the system. We had to ramp up on a third-party system and understand it in-depth to troubleshoot.
– There was no clear pattern to the system failures.
– The failures occurred regularly only at offshore locations and could not be reproduced onshore.
– The system consisted of several layers of technology, from servers and databases to web services and virtualization solutions.
The Process
– Identify the problem and establish success criteria.
– Analyze the system and benchmark its performance.
– Optimize the systems.
– Deploy, test and maintain the system.
How Optimus Helped
Optimus has an ongoing relationship with the client, so assigning a resource to troubleshoot this issue was not difficult. The project began with a sit-down meeting with the application’s business and technical owners to understand the issue and agree on acceptable success criteria.
We began looking for the cause of the issue by reproducing it and establishing benchmarks against which improvements could be measured. To reproduce the problem, we set up seven workstations with automation scripts and bandwidth limiters to replicate the work being done offshore. By doing this, we identified the precise conditions that triggered the crashes. Once the problem was successfully reproduced in a test system, we systematically searched for the cause of the failure.
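The idea behind this kind of reproduction setup can be sketched in a few lines. The example below is not the tooling used on the project, which this case study does not describe. It is a minimal, self-contained simulation in which seven threads stand in for the workstations and a sleep-based throttle stands in for the bandwidth limiter. Every name, invoice size and bandwidth figure in it is hypothetical.

```python
import threading
import time

def throttled_send(payload, bytes_per_sec, chunk_size=1024, sleep=time.sleep):
    """Simulate sending payload over a bandwidth-limited link.

    Sends chunk_size bytes at a time and pauses after each chunk so the
    average rate stays at or below bytes_per_sec. Returns the bytes sent.
    """
    sent = 0
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        # A real script would write the chunk to the system under test here.
        sent += len(chunk)
        sleep(len(chunk) / bytes_per_sec)
    return sent

def workstation(station_id, invoices, bytes_per_sec, results, sleep=time.sleep):
    """One simulated AP workstation: upload each invoice and time it."""
    timings = []
    for invoice in invoices:
        started = time.perf_counter()
        throttled_send(invoice, bytes_per_sec, sleep=sleep)
        timings.append(time.perf_counter() - started)
    results[station_id] = timings  # each thread writes only its own key

def run_simulation(stations=7, invoices_per_station=3, invoice_size=8 * 1024,
                   bytes_per_sec=64 * 1024, sleep=time.sleep):
    """Run all workstations concurrently and return timings per station."""
    results = {}
    threads = []
    for station_id in range(stations):
        invoices = [bytes(invoice_size)] * invoices_per_station
        thread = threading.Thread(
            target=workstation,
            args=(station_id, invoices, bytes_per_sec, results, sleep),
        )
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return results

if __name__ == "__main__":
    for station_id, timings in sorted(run_simulation().items()):
        print(station_id, [f"{t:.2f}s" for t in timings])
```

Recording a timing for every upload gives a simple benchmark: the same run can be repeated after each change to see whether response times improve under the constrained link.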
Our team found several areas to improve, ranging from the application to the database server. Notably, large attachments were moved out of the database and onto an application server. We then cleaned up the server and optimized its performance. Finally, our team reconfigured the web server to better handle the specific type of load.
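Moving attachments out of a database usually means writing each stored file to disk and keeping only its path in the table. The sketch below illustrates that pattern with SQLite from the Python standard library. The database product, schema and storage layout of the client system are not given in this case study, so the `invoice_attachments` table, its columns and the function name are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

def move_attachments_to_disk(conn, storage_dir):
    """Write each stored BLOB to a file and keep only its path in the table.

    Hypothetical schema:
        invoice_attachments(id, filename, content BLOB, file_path TEXT)
    Returns the number of attachments moved.
    """
    storage = Path(storage_dir)
    storage.mkdir(parents=True, exist_ok=True)
    rows = conn.execute(
        "SELECT id, filename, content FROM invoice_attachments "
        "WHERE content IS NOT NULL"
    ).fetchall()
    for row_id, filename, content in rows:
        # Prefix with the row id so duplicate filenames cannot collide.
        target = storage / f"{row_id}_{Path(filename).name}"
        target.write_bytes(content)
        # Clear the BLOB only after the file has been written.
        conn.execute(
            "UPDATE invoice_attachments SET content = NULL, file_path = ? "
            "WHERE id = ?",
            (str(target), row_id),
        )
    conn.commit()
    return len(rows)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice_attachments ("
        "id INTEGER PRIMARY KEY, filename TEXT, content BLOB, file_path TEXT)"
    )
    conn.execute(
        "INSERT INTO invoice_attachments (filename, content) VALUES (?, ?)",
        ("scan.pdf", b"%PDF-demo"),
    )
    with tempfile.TemporaryDirectory() as tmp:
        print("attachments moved:", move_attachments_to_disk(conn, tmp))
```

Keeping large files out of the table keeps rows small, which generally makes queries and backups lighter. The trade-off is that files and database records must then be kept in sync.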
Once the servers were cleaned up, reconfigured, and optimized for their specific loads, the system stabilized and there were no more regular crashes.
AP clerks can now process invoices more efficiently because they are no longer interrupted by system failures several times a day. Invoices are also processed using fewer resources.
Optimus also provided the client with an ongoing maintenance schedule that keeps the systems performing as expected.