A Deep Dive into AMD/Xilinx AXI Bridge for PCI Express (AMD/Xilinx PG194) and Why We Tweaked C_M_AXI_NUM_READQ

A Deep Dive into AMD/Xilinx AXI Bridge for PCI Express (AMD/Xilinx PG194) and Why We Tweaked C_M_AXI_NUM_READQ Executive Summary AMD/Xilinx’ AXI Bridge for PCI Express (PG194) implements a bi-directional communication channel from and to FPGA internal memory mapped AXI4 masters and slaves to and from external PCIe connected memory mapped devices, with the FPGA operating as PCIe endpoint or root port. In many scenarios the performance of forwarding communication between the two protocols, AXI4 and PCIe, is sufficient and the AMD/Xilinx IP core can be used as well. However, in certain cases tweaking is necessary to achieve the expected throughput. Depending on the amount of extra performance required the modification ranges from simply tuning a hidden parameter to patching the IP’s HDL sources. In the example project used for this description the PCIe peer to peer (P2P) write performance from an FPGA to a 12 NVMe SSD RAID0 increased from 2.700 MiB/s to 4.900 MiB/s to 8.600 MiB/s. AMD/Xilinx AXI Bridge for PCI Express Overview The AMD/Xilinx AXI Bridge for PCI Express is implemented differently for different AMD/Xilinx FPGA families. This description focuses on the “AMD/Xilinx DMA//Bridge Subsystem for PCI Express in AXI Bridge mode” implementation as found in AMD/Xilinx UltraScale+ devices. The bridge IP core comprises the actual bridging logic, converting

Picking The Right Granularity When Buffering PCIe/NVMe Data

Picking The Right Granularity When Buffering PCIe/NVMe Data You know our Mission: If It Is Packets, We Make It Go Faster – today the many flavors of memory for buffering data in FPGAs. Non-Volatile Memory Express (NVMe) is an interface specification often used with PCIe. Its goal is to leverage the parallelism and low latency of modern SSDs. A typical PCIe payload data transfer happens in data chunks of either 128 Byte or 256 Byte. SSDs deploy several tricks (wear leveling, SLC to TLC conversion) to enhance the read and write speeds as well as their lifespan. One downside is that their read and write speed is not constant over a long write/read period which might result in backpressure. Some applications do not support back pressure that can lead to an erroneous state if one employs a standard SSD system.   One possible mitigation strategy is to have an elastic buffer between the SSD and the data source.  Using an FPGA, there are different possibilities to implement an elastic buffer. At MLE, we investigated BlockRAM (BRAM), UltraRAM (URAM), Dynamic RAM (DRAM) and the second generation of High Bandwidth Memory  (HBM2). Each memory technology has its advantages and disadvantages regarding its capabilities to handle different data chunk sizes.  We will present our findings below. 

Please fill in the form below, so we can give you access to the Remote Evaluation System.

    NPAP-10G Remote Eval.NPAP-25G Remote Eval.