Enkitec Rapidly Implements Oracle Exadata for National Healthcare Provider

Business Problem

In April 2011, Enkitec assisted a national healthcare provider in migrating its production databases to Exadata. Exadata was selected as a consolidation platform for eight (8) production databases running on five (5) servers. One of the driving factors of the project was the decision to discontinue the use of the provider’s Storage Area Network (SAN) for Oracle data storage. This imposed a strict implementation timeline, since the SAN was to be removed at the end of its lease term. That left only four (4) weeks for the Exadata machines to be delivered, installed, and moved into production.

The speed with which Exadata could be deployed was a major consideration in the decision to use it. The customer also assumed that Exadata would outperform the legacy equipment, although no testing of the custom applications was done prior to the purchase. The fact that the Exadata platform included Oracle’s Real Application Clusters (RAC) technology did not play into the purchase decision, since the legacy systems were not required to be highly available.

The databases to be migrated to the Exadata were not large, certainly by Exadata standards, with most in the 10-100 gigabyte range. In most cases, the applications were truly OLTP in nature, meaning that the main optimization provided by Exadata (query offloading) would play a relatively minor role. The first application migrated to the Exadata was a contact management system, which had two distinct components:

  • An OLTP component used to maintain patient information gathered from clinics in multiple locations.
  • An ETL component which would evaluate changes in the OLTP data and push the changes to the service delivery system.

The legacy database host was a one-year-old, Intel-based server with eight (8) CPU cores and 24 GB of RAM running an Oracle 10.2.0.5 database. The database server was connected via Fibre Channel to a top-of-the-line SAN. This environment was handling the OLTP operations without issue, but the ETL processes required so much CPU that they had to be throttled down to keep them from negatively affecting the OLTP application. As a result, the service delivery system often didn’t contain the information required to process a customer in a timely fashion.

Solutions

OLTP Testing

The initial phase of testing presented some interesting challenges.  Some were expected while others were less obvious.  Keep in mind that we were testing an OLTP application whose data was mostly accessed from memory via the standard Oracle Buffer Cache.  The most obvious issues we encountered were related to differences in the performance of certain queries as a result of upgrading from version 10.2.0.5 to 11.2.0.2 of the Oracle database.  This is a routine situation with database upgrades regardless of the versions since the Query Optimizer changes with almost every new release.

Following the database upgrade, the performance of most queries improved, but a small percentage of them exhibited performance regressions. While Oracle 11g provides numerous mechanisms for reducing or eliminating the possibility of query plan changes, the aggressive implementation schedule did not allow these techniques to be used. Instead, we used the “old-fashioned” method: we identified the poorly performing statements and fixed them.

As a result of this brief tuning exercise, the OLTP application performed slightly better on the Exadata than on the legacy hardware. The results matched our expectations, given that the Exadata had faster CPUs and most of the queries were being satisfied from the Buffer Cache. Since Exadata was designed to optimize disk access, and this system did very little disk I/O, we expected only an incremental improvement in performance.

ETL Testing

As described above, the legacy application included a CPU-intensive ETL process, which ran against the OLTP database.  The ETL process was actually a collection of queries that would look for changes in a corporation’s contact data and push them to the service delivery system in each market (84 in total).  The users of the service delivery system would often be forced to wait for the ETL process to complete before they could check a patient in to their local system.

The ETL process ran for less than 10 seconds per market, but had to be executed for each of the 84 markets.  Since the OLTP and ETL processes were sharing the same database server, the ETL process had to be throttled back to allow the OLTP system to operate.  The result was that the ETL process took up to 20 minutes to complete, which extended the time required to process a patient.  In some cases, the differences in data between the two systems hindered the provider’s ability to realize revenue on the patient visit.
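The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: it assumes the 84 markets were processed one after another (the case study does not say whether any ran concurrently) and uses the stated 10-second ceiling as the per-market runtime, so the result is an upper bound rather than a measurement.

```python
# Figures taken from the case study text.
MARKETS = 84
MAX_SECONDS_PER_MARKET = 10       # "less than 10 seconds per market"
OBSERVED_THROTTLED_MINUTES = 20   # "up to 20 minutes" when throttled


def serial_runtime_minutes(markets: int, seconds_per_market: float) -> float:
    """Total runtime in minutes if markets run one after another (an assumption)."""
    return markets * seconds_per_market / 60


# Upper bound on the unthrottled run, since each market took less than 10 seconds.
unthrottled_bound = serial_runtime_minutes(MARKETS, MAX_SECONDS_PER_MARKET)

# In the 20-minute worst case, throttling accounts for at least this much.
throttling_overhead = OBSERVED_THROTTLED_MINUTES - unthrottled_bound

print(f"Unthrottled upper bound: {unthrottled_bound:.0f} minutes")
print(f"Added by throttling in the worst case: at least {throttling_overhead:.0f} minutes")
```

Under those assumptions, the unthrottled process would finish in at most 14 minutes, so throttling added at least six minutes to the 20-minute worst case.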

When the ETL process was first run on a single node of the Exadata machine, it completed in about two (2) minutes. That’s 10 times faster than the 20-minute worst case on the legacy system. Great, right? Not so fast. During this run, the CPU utilization on the database server averaged above 90%, which was unacceptable. Since the Exadata Quarter Rack contained multiple database servers, it was decided to partition the workload so that one server was dedicated to OLTP and the other to ETL processing.

This “workload partitioning” solution did not solve the underlying problem presented by the ETL system.  Its CPU requirements on the database server were still too high.  When we looked at the ETL queries, we saw an abundance of Full Table Scans.  On an Exadata, this is what you want to see since the Smart Scan features require Full Table Scans.  What we did not see, however, were queries being offloaded to the Storage Servers.  The reason was that the tables being accessed by the ETL process were considered to be too small to warrant query offloading.
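One commonly used way to confirm whether a statement was offloaded is to look at the `IO_CELL_OFFLOAD_ELIGIBLE_BYTES` column of `V$SQL`, which remains zero when no Smart Scan took place. The sketch below is not the procedure Enkitec describes; the query text, the `ETL_USER` schema name, and the helper `summarize_offload` are illustrative assumptions.

```python
from typing import Dict, Iterable, NamedTuple

# Query against V$SQL on Oracle 11.2. ETL_USER is a hypothetical schema name.
OFFLOAD_CHECK_SQL = """
SELECT sql_id,
       io_cell_offload_eligible_bytes
  FROM v$sql
 WHERE parsing_schema_name = 'ETL_USER'
"""


class OffloadRow(NamedTuple):
    """One row of the query above: a statement and its offload-eligible bytes."""
    sql_id: str
    eligible_bytes: int


def summarize_offload(rows: Iterable[OffloadRow]) -> Dict[str, bool]:
    """Map each sql_id to True if any of its child cursors reported eligible bytes."""
    summary: Dict[str, bool] = {}
    for row in rows:
        summary[row.sql_id] = summary.get(row.sql_id, False) or row.eligible_bytes > 0
    return summary
```

The rows would come from running `OFFLOAD_CHECK_SQL` through whichever Oracle client is available. Statements that map to `False` correspond to the situation described above: Full Table Scans that were not offloaded to the Storage Servers.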

Using our knowledge of both the Exadata platform and the Oracle database, we enabled query offloading for a majority of the ETL queries, which shifted the CPU load from the database server to the Exadata Storage Servers. Since there are multiple storage servers, each with multiple cores, the CPU load on each storage server was significantly lower than it had been on the single database server. In the end, the ETL server’s CPU utilization averaged around 25% and the storage servers’ average CPU utilization was about 20%. The benefit to the business was that patient wait times were reduced dramatically, from 20 minutes to less than five (5), and the ETL processes no longer needed to be throttled back.

Benefits

While this project was described as a consolidation effort, it was also a “speed to market” exercise. The customer was discontinuing the use of the SAN that housed their Oracle databases, which set a hard deadline for moving the data to Exadata. Because the window of time was so short, Enkitec could improve the ETL processes only by taking advantage of Exadata’s built-in optimizations.

The application development team was not able to re-architect any of their processes since their change control cycle was relatively long, requiring weeks to implement code changes. This particular application obviously did not hit the sweet spot of many of Exadata’s built-in advantages due to the relatively small volumes of data. Nevertheless, the speed of implementation and the built-in advantages of the platform allowed for a successful outcome, and provided the business with a much more functional system.

Overall, the transition to Exadata was a success and the application development team now knows how to exploit the features of Exadata.  We fully expect to see the runtimes for the ETL portion of this application decrease in the future as the developers are able to make better use of the available features of the Exadata platform.