This post is intended for IT architects and network administrators who want to understand the rationale behind NVMe over Fabrics technology.
- What is RDMA?
- HowTo Configure NVMe over Fabrics
To address the shortcomings of the legacy storage stack, the major storage industry vendors formed an industry consortium, NVM Express, which created Non-Volatile Memory Express (NVMe): a redesigned, standardized, and efficient programming interface for accessing NVM devices connected over the PCIe bus.
The main goals for the new programming interface include:
- Lock-free multi-thread/multi-process NVM access – achieved through MSI-X interrupt steering and the use of multiple queues for work submission and completion.
- Simple command set – only 13 commands are required, compared to more than 200 commands in legacy storage interfaces. Every command uses a fixed 64-byte format.
- Up to 64K queues per NVM controller and up to 64K commands per queue – this enables parallelism within a process and prevents head-of-line blocking when processing commands with variable latency.
To keep storage access times acceptable, the industry has been working on innovative solutions for accessing high-capacity storage devices with low latency, high bandwidth, and minimal CPU involvement. This approach greatly reduces the CPU cycles spent on storage processing, freeing them for application use. Storage providers have chosen the PCIe bus as the physical interface to storage devices for the following reasons:
- Good latency characteristics - PCIe devices are connected directly to the CPU
- Scalable performance - the PCIe lane architecture enables NVM designs to support 1 GB/s, 4 GB/s, and higher
- Low power features - PCIe power features support multiple link power states to reduce the system power consumption
In the process of moving storage to PCIe, SSD providers developed proprietary drivers to maintain compatibility with legacy software.
As shown in the figure, recent innovations in storage media and media controllers have decreased the latency of accessing non-volatile memory devices. As current and future device generations reduce this latency further, software becomes responsible for an increasingly large share of the total read latency.
NVMe is designed to work over a PCIe bus. Legacy network storage stacks could, in principle, be used to operate NVMe devices remotely; however, their synchronization and command-translation overhead defeats the benefits of NVMe for remote access. New software stacks are needed to take advantage of this new, efficient architecture.
NVMe over Fabrics is the protocol used to transfer NVMe storage commands from client nodes to target nodes over InfiniBand or Ethernet networks using RDMA technology.
Note: The term Fabrics refers to RDMA-capable networks, either InfiniBand or (RDMA over) Ethernet.
NVMe over Fabrics aims to standardize the wire protocol and drivers for efficient access over RDMA capable networks with minimal processing required by the target node.
NVMe over Fabrics is designed as a lightweight protocol layer that runs above the standard RDMA interface, Verbs.
The simplicity of the standard, combined with Mellanox's high-performance RDMA transport solutions, provides fertile ground for innovation.
To learn how to configure NVMe Over Fabrics, refer to HowTo Configure NVMe over Fabrics.