Database Performance & Real World Questions

I get questions occasionally from customers.

One recently was, ‘Can Intel Xeon Processors handle a 20TB Oracle database?’

We get this question occasionally, and taken literally it doesn’t make much sense to me.  I understand the basis of the question: the customer is concerned about whether Xeon can tackle a very large database.  Is the question really, ‘Can Xeon read in a lot of data and process it efficiently and quickly?’  We can easily show that the Xeon E7 family of processors can do this faster in benchmark tests than most proprietary RISC processors.


[Figure: benchmark results chart.  Higher is better.]

Where the question falls apart is in the premise: can a 64-bit Xeon address 20TB?  If a 64-bit RISC processor can address 20TB, then a 64-bit Xeon can as well.  No database is going to read 20TB of data at a time, and besides, an Oracle database is going to have a lot of space that is either empty or unused.  (For instance, is there really 20TB of data, or is it really 12TB or less?)  But the customer’s concern usually goes deeper.  So let’s break this issue down.
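On the addressability point, a quick back-of-the-envelope check makes the premise concrete.  This is only a sketch: the function names are made up for illustration, and the 48-bit case is an assumption added here to stand in for processors that implement fewer address bits than the full 64.

```python
TIB = 2**40  # bytes in one tebibyte


def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with the given number of address bits."""
    return 2**address_bits


def headroom(address_bits: int, database_tib: int) -> float:
    """How many times over the database would fit in the address space."""
    return addressable_bytes(address_bits) / (database_tib * TIB)


if __name__ == "__main__":
    # 48 bits is an illustrative assumption; 64 bits is the full width.
    for bits in (48, 64):
        total_tib = addressable_bytes(bits) // TIB
        print(f"{bits}-bit: {total_tib:,} TiB addressable, "
              f"{headroom(bits, 20):,.1f}x a 20TB database")
```

Either way the address space is far larger than 20TB, and that holds for any 64-bit processor, which is why the premise does not separate Xeon from RISC.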

What is the number of users?  This is a useful question.  For instance, is it a data warehouse with only a handful of users?  Or is it a highly transactional database with thousands of users?  In either scenario Xeon is great.  (In 2008 and 2009 I was a DBA for Oracle on a benchmark they were running of a 10TB medical database with between 10 and 20 thousand users.  The Xeon processors for this benchmark were from a number of generations ago.)

Another question that may be hiding in there is: ‘What is the largest data file I can create for my 20TB database?’  What I’ve found behind this question is a concern regarding the manageability of the database, given the number of datafiles that would need to be created to get to 20TB.  (For that benchmark 3 years ago it took me all weekend to build a 10TB database with 1GB datafiles.  I had them spread out, but there were an awful lot of them.  Today, with much faster I/O, creating a 20TB database will be much faster.)
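To put a number on ‘an awful lot of them’, here is a rough sketch of the datafile arithmetic.  The 1GB size is the one from the benchmark above; the 8GB and 32GB sizes are illustrative assumptions only, since the real maximum datafile size depends on how the database is configured.  The function name is hypothetical.

```python
GB_PER_TB = 1024  # binary units: 1TB = 1024GB


def datafiles_needed(database_tb: int, datafile_gb: int) -> int:
    """Number of datafiles of a given size needed to hold the database."""
    database_gb = database_tb * GB_PER_TB
    # Ceiling division, so a partly filled last file still counts.
    return -(-database_gb // datafile_gb)


if __name__ == "__main__":
    # 1GB comes from the benchmark; 8GB and 32GB are assumed for comparison.
    for datafile_gb in (1, 8, 32):
        for database_tb in (10, 20):
            count = datafiles_needed(database_tb, datafile_gb)
            print(f"{database_tb}TB database, {datafile_gb}GB datafiles: "
                  f"{count:,} files")
```

With 1GB datafiles, a 10TB database works out to 10,240 files and a 20TB database to 20,480, which is where the manageability worry comes from.  Larger datafiles bring the count down quickly.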

Another concern raised by the original question would be memory addressability.  For large databases the thinking is that the datasets being processed in memory are very large.  Can Xeon address as much memory as a proprietary RISC processor?  In other words, can Xeon scale up?  Do the platforms sporting a Xeon E7 processor have the same memory capacity as servers with a proprietary RISC processor?  We can easily demonstrate that Xeon fills the bill: platforms from a variety of vendors can support 2TB to 6TB of RAM.

Another concern raised by the question might be concurrent processing.  With a 20TB database, a lot of the processing may utilize Oracle’s parallel query function.  The Xeon E7 family, with its multi-core and Hyper-Threading technologies, can easily handle significant parallel processing.  For example, I started running Oracle Parallel Query Option (PQO) in 1996 when the feature first came out, and I was using a 24-processor Sequent server utilizing Pentium processors.

I imagine there are additional ways to break this question down, but overall the question ‘Can the Xeon E7 processor run a 20TB database?’ deserves an answer that addresses the real issues.  The simple answer is a resounding YES!