How Intel and IBM Did Big Data 148x Better

Though Intel and IBM have celebrated many achievements together during our rich history of co-engineering, our current collaboration, involving the Intel® Xeon® E7 v2 processors and IBM’s latest DB2 database technologies, is delivering breakthrough results, with benchmark performance gains of 148x over the previous generation of software and processors.

How did Intel and IBM generate such dramatic performance improvement?

To find out and gain a better understanding of how collaboration flows between IBM and Intel, I went behind the scenes and talked with Jantz Tran, an Intel performance application engineer who works closely with IBM DB2 development.

In fact, Jantz works so closely with IBM that he has an office at IBM’s Silicon Valley Labs. He basically embodies the collaboration between the two companies, working directly with IBM developers to ensure that Intel technologies map to DB2 database development, and vice versa.

“My team assists the IBM dev groups by answering any technical questions they may have about Intel processors,” says Jantz. “For instance, I help ensure DB2 software can take advantage of the parallelism and vectorization support built into the most recent Intel Xeon processors. I also work with the DB2 performance team to set up and tune software and hardware for analysis and benchmark testing on joint IBM and Intel platforms.”

Some of Jantz’s most exciting recent projects involved aligning the new columnar database format in IBM DB2 with BLU Acceleration* to the new instruction sets and vectorization support in Intel Xeon E7 v2 processors.

Columnar data processing is much faster than row-based processing for scanning massive data sets and performing analytical queries, particularly when supported by the Intel® Advanced Vector Extensions (Intel® AVX) and Streaming SIMD Extensions (SSE) instructions in Xeon E7 v2 processors. These instructions enable DB2 to pack more data elements into a single processor register, and multicore parallelism lets it divide query processing into multiple threads that work simultaneously.
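To make the register-packing idea concrete, here is a minimal Python sketch. It is not DB2 code: the column contents, the 32-bit element width, and the helper names `lanes` and `chunked_count_equal` are all assumptions made for illustration. The only hardware facts it relies on are that SSE registers are 128 bits wide and AVX registers are 256 bits wide.

```python
from array import array

SSE_BITS = 128  # width of an SSE register
AVX_BITS = 256  # width of an AVX register


def lanes(register_bits, element_bits):
    """Return how many fixed-width elements fit in one register."""
    return register_bits // element_bits


def chunked_count_equal(column, target, lane_count):
    """Count matches, walking the column one register-sized chunk at a time."""
    matches = 0
    for start in range(0, len(column), lane_count):
        chunk = column[start:start + lane_count]
        # On real hardware, one SIMD compare would test the whole chunk at once;
        # plain Python can only imitate that by looping over the chunk.
        matches += sum(1 for value in chunk if value == target)
    return matches


# Hypothetical column of integer region codes, stored contiguously
# the way a columnar store keeps one attribute together.
region_codes = array("i", [3, 1, 3, 2, 3, 3, 0, 1, 3, 2])

sse_lanes = lanes(SSE_BITS, 32)  # assumes 32-bit elements
avx_lanes = lanes(AVX_BITS, 32)
count = chunked_count_equal(region_codes, 3, sse_lanes)
```

The narrower each encoded value is, the more of them fit in a register, which is one reason compressed columnar data and vector instructions pair well.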

“DB2 with BLU Acceleration is a re-architecting of IBM’s database platform, adding columnar store to existing row-based data store,” says Jantz. “This allows the technology to take advantage of Intel AVX and SSE instructions to tap into the massively increased performance potential of highly parallelized, multicore processing.

“This is where you get those really big, 148x performance improvements. In the benchmark, just upgrading from previous generation IBM DB2 10.1* to IBM DB2 with BLU Acceleration increased workload speed 77x. Upgrading to Intel Xeon E7 v2 processors from previous generation chips doubled the performance.”
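As a quick sanity check on how those figures relate, the short Python sketch below assumes the software and hardware gains multiply. That is an assumption for illustration, since the benchmark methodology is not described here. Under it, the 148x total and the 77x software gain imply a hardware factor a little under 2x, which is consistent with "doubled" being a rounded figure.

```python
TOTAL_SPEEDUP = 148.0     # combined gain reported in the benchmark
SOFTWARE_SPEEDUP = 77.0   # DB2 10.1 to DB2 with BLU Acceleration

# Assumption: total = software gain x hardware gain.
implied_hardware_speedup = TOTAL_SPEEDUP / SOFTWARE_SPEEDUP

# What the total would be if the hardware gain were exactly 2x.
total_if_exactly_doubled = SOFTWARE_SPEEDUP * 2

print(round(implied_hardware_speedup, 2))
print(total_if_exactly_doubled)
```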

Jantz says running DB2 with BLU Acceleration on Intel Xeon E7 v2 processors lets you take advantage of actionable data-compression features and make much more effective use of system memory.

“Packing columnar data into SSE registers allows you to use memory pools much more efficiently than row-based stores,” he says, “because you can run queries and evaluate data while it is still compressed. In fact, data compression with columnar store is so much more efficient it requires a lot less memory to run the same data set. So you can house a much larger columnar database on a much smaller memory footprint.”
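The idea of evaluating data while it is still compressed can be sketched in a few lines of Python. This example uses simple dictionary encoding as a stand-in; DB2's actual compression scheme is not described in this post, and the function names and sample data are hypothetical. The point is only that the query value is translated to a code once, after which the scan compares small integer codes and never decompresses the column.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code from a sorted dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in values]
    return dictionary, codes


def count_equal_compressed(dictionary, codes, target):
    """Count rows equal to target without decoding the column."""
    if target not in dictionary:
        return 0  # the value never occurs, so no scan is needed
    target_code = dictionary.index(target)  # translate the predicate once
    return sum(1 for code in codes if code == target_code)


# Hypothetical column of city names.
cities = ["Austin", "Boston", "Austin", "Denver", "Austin", "Boston"]
dictionary, codes = dictionary_encode(cities)
austin_rows = count_equal_compressed(dictionary, codes, "Austin")
```

Because the codes are much smaller than the original strings, more of them fit in memory and in each register during a scan.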

For example, in benchmark tests that Jantz helped engineer, running 10 TB of raw data through the previous generation IBM DB2 10.1 resulted in a row-based database size of 9.69 TB. (That means running 10 TB of data in-memory required about 10 TB of memory.) However, running the same 10 TB of raw data through DB2 with BLU Acceleration with columnar store and data compression required only 2.13 TB.

In other words, the same data was 4.55x smaller with DB2 with BLU Acceleration using actionable compression than with DB2 10.1 using static compression!
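The 4.55x figure follows directly from the two database sizes quoted above. A short Python check, using only the numbers from the benchmark (the variable names are ours):

```python
RAW_DATA_TB = 10.0        # raw data loaded in the benchmark
ROW_STORE_TB = 9.69       # database size under DB2 10.1
COLUMN_STORE_TB = 2.13    # database size under DB2 with BLU Acceleration

# How many times smaller the columnar database is than the row-based one.
size_reduction = ROW_STORE_TB / COLUMN_STORE_TB

# Fraction of the raw data size that the columnar database occupies.
fraction_of_raw = COLUMN_STORE_TB / RAW_DATA_TB

print(round(size_reduction, 2))
print(round(fraction_of_raw, 3))
```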

“So if you have 10 TB of raw data and 2 TB of memory, you can run it as an in-memory database using DB2 with BLU Acceleration and Intel Xeon E7 v2 processors,” says Jantz. “The bottom line: These technologies allow you to run large primary databases directly in-memory at orders-of-magnitude improved performance.”

These performance achievements are groundbreaking, even against a backdrop of more than 15 years of collaboration and joint engineering between Intel and IBM, with generation after generation of improvements in performance, uptime, and reliability.

Want to learn more about how IBM DB2 with BLU Acceleration and Intel technologies work together to deliver on the promise of Big Data?

And to discover how to unlock the value of your own data with Intel and IBM innovations, read the white paper. Both offer insight into what are truly amazing accomplishments in this IBM-Intel collaboration.

Follow Tim and the Big Data community for Intel at @TimIntel.

Tim Allen

About Tim Allen

Tim is a strategic relationship manager for Intel, driving enablement for enterprise software companies in the cloud, big data, analytics, AEC, commercial VR, datacenter, and IoT. Tim has 20+ years of industry experience, including work as a systems analyst, developer, system administrator, enterprise systems trainer, product marketing engineer, and marketing program manager. Prior to Intel, Tim worked at IBM, Tektronix, Intersolv, Sequent, and Con-Way Logistics. Tim holds a BSEE in computer engineering from BYU and an MBA in finance from the University of Portland. Specialties include PMP, MCSE, CNA, HP-UX, AIX, Shell, Perl, and C++.