Intel Processors and FPGAs—Better Together

Introducing the Intel® Xeon® Scalable processor with integrated Intel® Arria® 10 FPGA

I'm proud to share that the Intel® Xeon® Scalable processor with integrated Intel® Arria® 10 field programmable gate array (FPGA) is now available to select customers. This marks the first production release of an Intel® Xeon® processor with a coherently interfaced FPGA—an important result of Intel’s acquisition of Altera. The combination of these industry-leading FPGA solutions with Intel’s world-class processors enables customers to create the next generation of data center systems with flexible workload-optimized performance and power efficiency.

Intel® Xeon® Scalable processor with integrated FPGA

The Intel® Xeon® Scalable Processor 6138P includes the Intel® Arria® 10 GX 1150, which provides up to 160Gbps of I/O bandwidth per socket and a cache-coherent interface for tightly coupled acceleration. The Intel® Arria® 10 GX 1150 has its own cache and shares memory with the processor via low-latency, cache coherent access over the Intel® Ultra Path Interconnect (Intel® UPI) bus. Unlike other system interface bus standards, Intel® UPI allows seamless access to data regardless of where the data resides (core cache, FPGA cache, or memory) without the need for redundant data storage and direct memory access (DMA) transfers. Data coherency also reduces application programming complexity and saves CPU cycles that would be wasted to determine which data is most-up-to-date.

A great example of this system capability is Intel’s new virtual switching reference design for the Intel® Xeon® Scalable processor with integrated FPGA. This reference design uses the FPGA for infrastructure dataplane switching, while the processor does application processing or processes virtual machines. This helps simplify network complexity and improve the productivity of the processor.

This solution is also compatible with the Open Virtual Switch (OVS) framework and delivers a dramatic 3.2X throughput improvement at half the latency and 2X more VMs as compared to OVS running on an equivalent processor without FPGA acceleration.1 Additionally, code compatibility with Intel’s OVS-DPDK software makes data center retrofits simple and scalable to optimize operational agility.

Intel® Xeon® Scalable processor with integrated FPGA

Fujitsu, a lead partner, plans to deliver systems based on the Intel® Xeon® processor with integrated FPGA and Intel’s OVS reference design. They are making the Intel® virtual switching reference design even more robust for the networking environment through their reliability, availability, and serviceability (RAS) with performance monitoring and debug assisting functions. This solution is being demonstrated this week at the Fujitsu Forum in Tokyo.

FPGAs continue to be an important part of Intel’s portfolio of workload-optimized solutions for the data center. Going forward, we will continue to improve ease of use of Intel® FPGAs and other accelerators in the datacenter. To provide our customers with greater deployment flexibility, Intel’s future roadmap will introduce a discrete FPGA solution with faster coherent and increased high-bandwidth interconnect enabled by the Acceleration Stack for Intel® Xeon® CPU with FPGAs. It will support code migration from the Intel® Xeon® Scalable processor with Integrated FPGA and the Intel® Programmable Acceleration Card (Intel® PAC) solutions, and will continue to be optimized for enhanced bandwidth and low latency.

To learn more about the new Intel® Xeon® Scalable processor with integrated FPGA, visit www.intel.com/accelerators.

 


1 The Intel® Xeon® Gold 6138P processor with Integrated Arria® 10 GX 1150 FPGA delivers up to 3.2X throughput with half the latency and 2X more VMs when compared to Intel® Xeon® Gold 6138P processor with software OVS (Open Virtual Switch) DPDK forwarding in the CPU user space application. Configuration: 2x Intel® Xeon® Gold 6138P processor with Integrated Intel® Arria® 10 GX 1150 FPGA on Blue Mountain Pass (BMP) platform, 12 x 16GB Micron 2Rx8 DDR4 2666MHz (192GB total), 240GB Kingston SSD, 1xPCI-E 3.0 x8 slot and 1xPCI-E 3.0 x10 slot, Network NICs:1x 100G Alaska NIC and 2 x Intel® Ethernet Network Adapter XXV710-DA2 (25GbE NIC) (fw 5.50.47059 api 1.5 nvm 5.51 0x80002bf8 1.1568.0), Operating System: Ubuntu-16.04.3, OS Kernel: 4.4.0-116-generic, Bios: SE5C620.86B.01.00.0813.041020180320 (Release Date: 04/10/2018), uCode: mb750654_02000043, FPGA BBS v6.4.0_Production (GBS 6.4.0, OPAE-0.12.1, Lib switch OPAE ver 1.1), VM opearting system: Ubuntu 17.10, OS Kernel: 4.13.0-31-generic, Benchmark: Open vSwitch 2.9.0, Compile ver: DPDK with "-Ofast", OVS with ./configure –with-dpdk=<DPDK SDK PATH>/x86_64-native-linuxapp-gcc make CFLAGS=“-Ofast -march=native”, Other software: DPDK version 17.11. Compared to 2x Intel® Xeon® Gold 6138P processor on Blue Mountain Pass (BMP) platform, 12 x 16GB Micron 2Rx8 DDR4 2666MHz (192GB total), 240GB Kingston SSD, 1xPCI-E 3.0 x8 slot and 1xPCI-E 3.0 x10 slot, Network NICs:1x 100G Alaska NIC and 2 x Intel® Ethernet Network Adapter XXV710-DA2 (25GbE NIC) (fw 5.50.47059 api 1.5 nvm 5.51 0x80002bf8 1.1568.0), Operating System: Ubuntu-16.04.3, OS Kernel: 4.4.0-116-generic, Bios: SE5C620.86B.01.00.0813.041020180320 (Release Date: 04/10/2018), uCode: mb750654_02000043, VM opearting system: Ubuntu 17.10, OS Kernel: 4.13.0-31-generic, Benchmark: Open vSwitch 2.9.0, Compile ver: DPDK with "-Ofast", OVS with ./configure –with-dpdk=<DPDK SDK PATH>/x86_64-native-linuxapp-gcc make CFLAGS=“-Ofast -march=native”, Other software: DPDK version 17.11.

The benchmark results may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilizedin the testing, and may not be applicable to any particular user's components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.  Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions.  Any change to any of those factors may cause the results to vary.  You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.