Physical Address Memory Region

Version 9

    This post discusses Physical Address Memory Region usage.

    This post is meant for developers and advanced users who have experience with libibverbs.

     


    Overview

    The Physical Address Memory Region (PA-MR) allows the user to manage physical memory used for posting send and receive requests. This can benefit the performance of storage applications that register large memory regions with random access.

     

    Virtual Memory Regions

    Previously, a user-space application would allocate a virtual memory region and register it using the MR verbs. When a virtual memory region is registered, the kernel handles the issues involved in combining RDMA with virtual addresses.

    When working with RDMA and virtual memory it is important to know that:

    • Virtual memory pages can be swapped and therefore should be pinned.
    • Translation to a physical address takes time to calculate.
    • Virtual memory is not necessarily contiguous in physical memory, and therefore each page must be translated separately.
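
    To make the per-page translation cost concrete, the following minimal sketch counts how many pages a virtually addressed buffer spans. The helper name pages_spanned is hypothetical and is not part of libibverbs. The sketch assumes that vaddr + length does not overflow 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [vaddr, vaddr + length). Each of these pages may map to a different
 * physical frame, so each one needs its own translation.
 */
uint64_t pages_spanned(uint64_t vaddr, uint64_t length, uint64_t page_size)
{
    if (length == 0)
        return 0;

    uint64_t first_page = vaddr / page_size;
    uint64_t last_page = (vaddr + length - 1) / page_size;

    return last_page - first_page + 1;
}
```

    For example, with 4 KiB pages, a page-aligned 1 GiB buffer spans 262144 pages, while a 2-byte buffer that starts on the last byte of a page already spans 2 pages.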

     

    Physical Memory Regions

    When working with a physical address memory region (PA-MR) the memory region that is used must be:

    • Not swappable (pinned)
    • A contiguous address space in physical memory
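
    The contiguity requirement can be checked with a sketch like the one below, assuming the application has already obtained the physical address of every page in the buffer. The function name is_physically_contiguous and the page_phys array are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: page_phys[i] holds the physical address of the
 * i-th page of a buffer. Returns 1 when every page starts exactly
 * page_size bytes after the previous one, 0 otherwise.
 */
int is_physically_contiguous(const uint64_t *page_phys, size_t num_pages,
                             uint64_t page_size)
{
    for (size_t i = 1; i < num_pages; i++) {
        if (page_phys[i] != page_phys[i - 1] + page_size)
            return 0;
    }
    return 1;
}
```

    A buffer that fails this check cannot be described by a single starting physical address and length.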

     

    One way of allocating physical memory is by using Linux huge pages, which reserve a non-swappable, contiguous physical address space. This memory can be accessed using the mmap system call.
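
    A minimal sketch of such an allocation is shown below. It assumes a 2 MiB huge page size and that the administrator has already reserved huge pages (for example through the vm.nr_hugepages sysctl). The names HUGE_PAGE_SIZE, round_up_to_huge_page and alloc_huge_buffer are hypothetical. Note that physical contiguity is only guaranteed within a single huge page.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: the system uses 2 MiB huge pages. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Round a length up to a whole number of huge pages. */
size_t round_up_to_huge_page(size_t len)
{
    return (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
}

/*
 * Map an anonymous buffer backed by huge pages.
 * Returns NULL when no huge pages are available.
 */
void *alloc_huge_buffer(size_t len)
{
    size_t mapped_len = round_up_to_huge_page(len);
    void *buf = mmap(NULL, mapped_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (buf == MAP_FAILED)
        return NULL;

    /* Touch the memory so the huge pages are actually faulted in. */
    memset(buf, 0, mapped_len);
    return buf;
}
```

    The buffer is released with munmap, using the rounded-up length returned by round_up_to_huge_page.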

    There are multiple methods of translating the virtual address returned by mmap to a physical address. Using a physical memory region and supplying the physical address when posting requests improves random access performance, since no translation from virtual address to physical address is needed.
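
    One possible translation method, shown here only as a sketch and not necessarily the one intended by this post, is to read the Linux /proc/self/pagemap file. Each 64-bit entry describes one virtual page: bit 63 indicates that the page is present in RAM and bits 0-54 hold the page frame number (PFN). The function names below are hypothetical, and a return value of 0 is used as an assumed failure marker. On Linux 4.2 and later the PFN is reported as zero unless the process has CAP_SYS_ADMIN.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGEMAP_PRESENT  (1ULL << 63)         /* page is present in RAM */
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)   /* bits 0-54: page frame number */

/* Decode one pagemap entry; returns 0 if the page is absent or the PFN is hidden. */
uint64_t pagemap_entry_to_phys(uint64_t entry, uint64_t page_size,
                               uint64_t offset_in_page)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return 0;
    return pfn * page_size + offset_in_page;
}

/* Translate a virtual address of the calling process; returns 0 on failure. */
uint64_t virt_to_phys(const void *vaddr)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t va = (uint64_t)(uintptr_t)vaddr;
    uint64_t entry = 0;
    ssize_t n;
    int fd = open("/proc/self/pagemap", O_RDONLY);

    if (fd < 0)
        return 0;

    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    n = pread(fd, &entry, sizeof(entry),
              (off_t)((va / page_size) * sizeof(entry)));
    close(fd);

    if (n != (ssize_t)sizeof(entry))
        return 0;
    return pagemap_entry_to_phys(entry, page_size, va % page_size);
}
```

    The page must have been touched before the lookup, otherwise it is not yet present and the sketch returns 0.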

     

    Security

    Note: When using PA-MR, users bypass the kernel memory protection mechanism, which might crash the system if used incorrectly. This feature is recommended only for experienced users who understand the possible risks.

     

    Enabling Physical Address Memory Region

    PA-MR is not enabled by default in MLNX_OFED for security reasons. MLNX_OFED sources should be recompiled using the following configuration flag:

      --with-pa-mr

     

    See also HowTo Compile MLNX_OFED Drivers (mlnx-ofa_kernel example).

     

    Code Example

    Before you start, make sure that MLNX_OFED was compiled using the --with-pa-mr flag. For an example, refer to HowTo Compile MLNX_OFED Drivers (mlnx-ofa_kernel example).

    1. Register a physical memory region from the user space using libibverbs as follows:

    struct ibv_exp_reg_mr_in in = {0};
    struct ibv_mr *physical_mr;
    uint64_t my_access_flags;

    /* Set IBV_ACCESS flags */
    my_access_flags = IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_READ |
                      IBV_ACCESS_REMOTE_WRITE |
                      IBV_ACCESS_REMOTE_ATOMIC |
                      IBV_EXP_ACCESS_PHYSICAL_ADDR;

    /* Allocate a physical MR - allowing access to all memory */
    in.pd = pd;                       /* Protection domain allocated earlier */
    in.addr = NULL;                   /* Address when registering must be NULL */
    in.length = 0;                    /* Memory length must be 0 */
    in.exp_access = my_access_flags;
    physical_mr = ibv_exp_reg_mr(&in);
    if (!physical_mr) {
        /* Registration failed; handle the error before posting requests */
    }

     

    Note: For security reasons, only users that have the CAP_SYS_RAWIO capability can allocate a PA-MR.
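
    An application may want to verify this capability before attempting the registration. The sketch below is one possible approach that avoids extra libraries: it parses the CapEff line of /proc/self/status and tests bit 17, which is the value of CAP_SYS_RAWIO in <linux/capability.h>. The function and macro names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* CAP_SYS_RAWIO is capability number 17 in <linux/capability.h>. */
#define PAMR_CAP_SYS_RAWIO_BIT 17

/* Returns 1 if the CAP_SYS_RAWIO bit is set in an effective capability mask. */
int cap_mask_has_rawio(uint64_t effective_mask)
{
    return (int)((effective_mask >> PAMR_CAP_SYS_RAWIO_BIT) & 1);
}

/*
 * Returns 1 if the calling process has CAP_SYS_RAWIO in its effective set,
 * 0 if it does not, and -1 if the capability mask could not be read.
 */
int process_has_cap_sys_rawio(void)
{
    char line[256];
    unsigned long long mask = 0;
    int found = 0;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        /* The line looks like "CapEff:<tab>0000003fffffffff" (hexadecimal). */
        if (sscanf(line, "CapEff: %llx", &mask) == 1) {
            found = 1;
            break;
        }
    }
    fclose(f);

    return found ? cap_mask_has_rawio((uint64_t)mask) : -1;
}
```

    If the check returns 0, the PA-MR registration can be expected to fail, and the application can report a clear error instead.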

     

    2. Create a scatter gather element (sge) as follows:

    /* Set scatter gather element for work request */
    struct ibv_sge sge = {0};
    sge.addr = physical_address;    /* The physical address */
    sge.length = SEND_LEN;          /* Send length from starting physical address */
    sge.lkey = physical_mr->lkey;   /* Local key of the PA-MR registered in step 1 */

    Note: The inline-receive feature requires the sge to contain a valid virtual address. Since this is not the case when a physical address MR is used, the two features should not be used together. Make sure that QPs that use the PA-MR do not have inline-receive enabled. Inline-receive is disabled by setting the max_inl_recv field in the ibv_exp_qp_init_attr struct to 0 when calling ibv_exp_create_qp.