https://blogs.intel.com/psg/gen-z-consortium-demos-5x-sqlite-database-acceleration-at-sc19-with-an-intel-upi-link-between-an-intel-xeon-cpu-and-an-intel-fpga/
Last week at the Supercomputing 2019 (SC’19) conference in Denver, the Gen-Z Consortium presented a proof-of-concept (POC) SQLite database acceleration demo that showed a 5x improvement in the average database INSERT operation time, with a clear path to even better performance. The demo employed a Gen-Z DRAM Memory Module (ZMM) with an integrated memory controller, connected over the Gen-Z fabric to an Intel FPGA. The Intel FPGA communicates with an Intel® Xeon® CPU over a coherent, low-latency Intel® Ultra Path Interconnect (UPI) link using an Intel UPI Home Agent IP block contributed by Intel specifically for this POC demo. The Intel FPGA thus acts as an Intel-UPI-to-Gen-Z bridge, as shown in this block diagram:
The demo’s figure of merit is the average time for a SQLite database INSERT operation, comparing performance with a locally attached SSD against performance with ZMMs connected over a Gen-Z fabric to the Xeon CPU. As the results screen for the POC demo shows (see below), the average INSERT operation targeting an SSD takes between 4 and 4.5 milliseconds (the gray line towards the top of the screen), while the average INSERT operation targeting Gen-Z DRAM modules takes around 750 microseconds (the orange line near the bottom of the screen). That’s better than a 5x performance increase, even for this early POC demo.
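The arithmetic behind the "better than 5x" figure can be checked directly from the numbers quoted above. The short Python sketch below only restates that calculation; the constant and function names are illustrative and do not come from the demo itself.

```python
# Latency figures as quoted in the text (read off the demo's results screen).
SSD_INSERT_MS_LOW = 4.0    # lower bound of the average SSD INSERT time, in ms
SSD_INSERT_MS_HIGH = 4.5   # upper bound of the average SSD INSERT time, in ms
GENZ_INSERT_US = 750.0     # approximate average Gen-Z DRAM INSERT time, in us


def speedup(baseline_ms: float, accelerated_us: float) -> float:
    """Return baseline latency divided by accelerated latency.

    The baseline is converted from milliseconds to microseconds first so
    that both values are in the same unit.
    """
    return (baseline_ms * 1000.0) / accelerated_us


low = speedup(SSD_INSERT_MS_LOW, GENZ_INSERT_US)
high = speedup(SSD_INSERT_MS_HIGH, GENZ_INSERT_US)
print(f"speedup range: {low:.2f}x to {high:.2f}x")
# prints "speedup range: 5.33x to 6.00x"
```

With the quoted latencies, the improvement works out to roughly 5.3x to 6x, which is consistent with the headline claim.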
Both sets of INSERT operations, with the SSD and with the Gen-Z DRAM modules, use precisely the same application code. No code modification was needed to redirect access to the ZMMs. Instead, a small software shim added to the operating system translates file-system calls made by the application program into ZMM memory accesses. Much of the performance improvement can be attributed to bypassing the OS’s file system and to the ZMMs’ inherent speed advantage over an SSD, but some of the boost arises from use of the Intel Xeon CPU’s coherent, low-latency Intel UPI port.
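To make the "same application code, different storage target" idea concrete, here is a minimal Python sketch of how an average INSERT time might be measured with the standard `sqlite3` module. It does not reproduce the demo's OS-level shim or its benchmark, neither of which is described in detail here. The function name, table schema, and row count are assumptions for illustration, and SQLite's built-in `:memory:` database is used only as a stand-in for a memory-backed target, not as a model of Gen-Z hardware.

```python
import os
import sqlite3
import tempfile
import time


def average_insert_seconds(db_path: str, rows: int = 200) -> float:
    """Return the mean wall-clock time of one committed INSERT into db_path.

    Hypothetical helper: the schema and row count are illustrative only.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS samples "
            "(id INTEGER PRIMARY KEY, payload TEXT)"
        )
        conn.commit()
        start = time.perf_counter()
        for i in range(rows):
            conn.execute(
                "INSERT INTO samples (payload) VALUES (?)", (f"row-{i}",)
            )
            # Commit every row so each INSERT is pushed to the storage layer.
            conn.commit()
        elapsed = time.perf_counter() - start
    finally:
        conn.close()
    return elapsed / rows


# The measurement code is identical for both targets; only the path differs.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_avg = average_insert_seconds(os.path.join(tmp_dir, "demo.db"))

# Stand-in for a memory-backed target (an assumption, not the Gen-Z path).
mem_avg = average_insert_seconds(":memory:")

print(f"file-backed average INSERT: {file_avg * 1e6:.1f} us")
print(f"in-memory average INSERT:   {mem_avg * 1e6:.1f} us")
```

The measuring function never needs to know what sits behind the path it is given, which mirrors the demo's claim that the application code was unchanged. Absolute timings from this sketch depend entirely on the machine and file system it runs on and should not be compared with the demo's results.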
Moving forward, this POC demo and the IP used will migrate to the recently announced Intel® Stratix® 10 DX FPGA, which supports the Intel UPI protocol as well as PCIe Gen4 x16. (See “Talk to PCIe Gen4 x16, Intel® UPI, Intel® Optane™ DC Persistent Memory, and SDRAM with one Intel® Stratix® 10 DX FPGA.”) According to Erich Hanke, Principal Engineer for Storage Products at IntelliProp, there’s a clear path to getting even more SQLite performance in future iterations of this demo. Additionally, this performance improvement will likely be directly applicable to other database applications such as MongoDB and Oracle Database.
Also, note that the next evolutionary step for coherent CPU-to-device interconnect is Compute Express Link (CXL). The CXL Consortium, an open industry-standard group initially formed by Alibaba, Cisco, Dell EMC, Facebook, Google, Hewlett Packard Enterprise, Huawei, Intel Corporation and Microsoft, already consists of 70 member companies, and version 1.1 of the CXL specification is already available for evaluation. Members of the recently announced Intel® Agilex® FPGA family support the CXL interconnect. (See “How do the new Intel Agilex FPGA family and the CXL coherent interconnect fabric intersect?”)
For more information about the Gen-Z IP discussed in this blog, please contact IntelliProp directly.
————————————————
About IntelliProp – IntelliProp, Inc. develops ASSP Products, licensable IP cores and highly integrated IP Products for Memory and Data Storage applications. Areas of significant expertise include SATA, SAS, PCIe/NVMe, Gen-Z, NVDIMM and RAID technologies. Headquarters, sales office, and design center are located in Longmont, CO. Please visit our website: https://intellipropipcores.com or contact IntelliProp at contact@intellipropipcores.com or (303) 774-0535.