Sizing Compute Needs in the Data Center

'We're gonna need a bigger boat' - Jaws

How to answer: "How big a box do I need?"

The number one request I hear from customers looking to move from RISC systems to IA is sizing: what size box do I need? This usually translates to: ‘How many CPUs do I need?’

Most professionals understand how much disk space they’ll need.  They also have a good idea of the amount of memory the application will require, because it is already running on the legacy platform.  But the CPU difference is the wild card.  I know memory speed, disk types, disk technology, and attachment methods all affect this, but I’ll address these and other target-server considerations in a future post.  Here I want to talk about the CPU.

Generational differences make a big impact.  Intel currently sells Xeon processors built at 32nm, while the legacy RISC processor is likely at 250nm or 90nm.  Intel sells Xeon processors with two processing threads per core, while the RISC processor could have up to four.  There are also clock-speed differences, but pipeline depth and the number of instructions per clock cycle are factors here too.  How does all of this factor in?  It can be overwhelmingly complex.

So, in desperation, users turn to benchmarks to make the comparison.  But which benchmarks should be considered?  Today there are so many benchmarks that this too is confusing, and the benchmarks keep changing with new generations.  Some benchmarks are dropped because they gave an unusual advantage to one vendor or another, and as technology advances with new application systems, the benchmark councils add new benchmarks to keep up.  For instance, SPEC went from integer and floating-point testing to adding Java and power-consumption benchmarks.

Sizing tools are available on various vendors’ web sites, but these are often for a new installation of a known application system, such as SAP components or Oracle eBusiness Suite applications.

But you want to move an existing application system or systems from the RISC server to the IA server.  This is different.  You want to know how YOUR system runs on the new systems.

Migration-oriented sizing tools provide an advantage here.  These tools measure how your system is running on the current platform, much as a capacity planning tool would, but then provide an estimate of what the load would be on a new target system.

The process goes something like this:

  • Install agents on the existing servers making up the application system.
  • Collect data for a period representative of a normal business cycle for the application, usually a month but sometimes as long as a quarter.
  • Configure the sizing tool to account for performance requirements, such as your SLAs and other availability requirements, and the targeted Xeon models.
  • Let the sizing tool ‘do its thing’ and process the accumulated data into a report.
  • You now have a rough estimate of the size of the system to buy, or to provision in a virtualized environment.

How does such a tool know how many cores you’ll need in the new server?  I had that question too.  Pulling back the covers on a few of these tools, I found that they use SPECint to compare processors.  On UNIX systems the agents are often just grabbing the sar data that is available to anyone.
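Grabbing that sar data yourself is not hard.  A minimal parser might look like the sketch below; note that the column layout of `sar -u` varies by UNIX flavor, and the sample output here assumes Linux sysstat formatting with made-up numbers.

```python
# Sketch: compute average CPU busy % from captured `sar -u` output.
# Sample assumes Linux sysstat column order: %user %nice %system %iowait %steal %idle
sample = """\
12:00:01 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
12:10:01 AM  all  45.20   0.00    10.10     2.30    0.00   42.40
12:20:01 AM  all  55.70   0.00    12.40     1.90    0.00   30.00
"""

busy = []
for line in sample.splitlines():
    fields = line.split()
    # Keep only the "all CPUs" data rows, skipping the header
    if len(fields) >= 9 and fields[2] == "all":
        idle = float(fields[-1])   # %idle is the last column
        busy.append(100.0 - idle)  # busy = 100 - idle

avg_busy = sum(busy) / len(busy)
print(round(avg_busy, 1))  # -> 63.8
```

In practice you would feed this a month of `sa` files rather than an inline string, and track the peak interval as well as the average.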

If you want to do your own sizing, you can use the sar data that is already available to you.  You then need to perform some minor ETL to produce the graphs you want in your favorite spreadsheet program.  Then you’ll need to pull down the SPECint values from the SPEC web site.  After some work on your part you should be able to get an approximation.  The catch: your RISC system may have 4 cores, but the only servers tested with the SPEC benchmark are 6-core systems of a different bin (or speed) than yours, or a 4-core system of the next generation of your RISC processor.  Some of the sizing tools have algorithms to do this interpolation for you.
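If you are doing it by hand, a simple linear interpolation between two published configurations gets you in the ballpark.  The sketch below estimates a rate result for an untested 4-core box from hypothetical published results at 2 and 6 cores; real tools likely use more sophisticated models that also account for clock speed and scaling efficiency.

```python
# Published SPEC rate results for tested configurations (cores -> score).
# These numbers are invented for illustration only.
published = {
    2: 60.0,
    6: 165.0,
}

def interpolate_rate(cores, published):
    """Linearly interpolate a rate score between two tested core counts."""
    lo, hi = min(published), max(published)
    frac = (cores - lo) / (hi - lo)
    return published[lo] + frac * (published[hi] - published[lo])

est_4core = interpolate_rate(4, published)
print(est_4core)      # -> 112.5  (estimated score for the untested 4-core box)
print(est_4core / 4)  # -> 28.125 (per-core figure to feed the sizing math)
```

Linear interpolation overstates things slightly because throughput rarely scales perfectly with core count, so treat the result as an upper bound on per-core performance.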

After all of this, you can buy your test system, load a backup of the application onto it, and test it against your careful sizing effort.

We’ll talk more about the loading of the backup in the near future.