1 Reply Latest reply on Nov 12, 2018 6:32 PM by yairi

    Hang occurred when mounting GlusterFS using driver 4.4.2 for MT27500 Family [ConnectX-3] on CentOS 7.1

    malfe

      Hello,

      After I changed the driver from 4.1 to 4.4.2 on CentOS 7.1, the system hangs after I mount and unmount GlusterFS a few times.

      Below are a screenshot of the console output and the dmesg output captured during the mount.

      Everything works with driver 4.1. I don't know how to debug this problem, and it has blocked me for a long time.

      I need some advice on how to work around this problem. Thanks.

       

      rdma_20181108135914.png

       

      [Fri Nov 2 15:07:38 2018] WARNING: at /var/tmp/OFED_topdir/BUILD/mlnx-ofa_kernel-4.4/obj/default/drivers/infiniband/core/cma.c:666 cma_acquire_dev+0x268/0x280 [rdma_cm]()

      [Fri Nov 2 15:07:38 2018] Modules linked in: fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop bonding rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) iTCO_wdt dcdbas iTCO_vendor_support mxm_wmi intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ses ipmi_devintf enclosure pcspkr ipmi_si ipmi_msghandler wmi acpi_power_meter shpchp lpc_ich mei_me sb_edac edac_core mei ip_tables xfs libcrc32c mlx4_ib(OE) mlx4_en(OE)

      [Fri Nov 2 15:07:38 2018] ib_core(OE) sd_mod crc_t10dif crct10dif_generic mgag200 crct10dif_pclmul drm_kms_helper crct10dif_common crc32c_intel syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mpt3sas raid_class drm scsi_transport_sas nvme ahci ixgbe libahci igb mdio libata ptp i2c_algo_bit mlx4_core(OE) i2c_core pps_core megaraid_sas devlink dca mlx_compat(OE) fjes dm_mirror dm_region_hash dm_log dm_mod zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate sg

      [Fri Nov 2 15:07:38 2018] CPU: 10 PID: 18958 Comm: glusterfs Tainted: P W OE ------------ 3.10.0-514.26.2.el7.x86_64 #1

      [Fri Nov 2 15:07:38 2018] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.4.3 01/17/2017

      [Fri Nov 2 15:07:38 2018] 0000000000000000 000000009310e9fa ffff8807f176bcf0 ffffffff81687133

      [Fri Nov 2 15:07:38 2018] ffff8807f176bd28 ffffffff81085cb0 ffff8810498cec00 0000000000000000

      [Fri Nov 2 15:07:38 2018] 0000000000000001 ffff88104f5a71e0 ffff8807f176bd60 ffff8807f176bd38

      [Fri Nov 2 15:07:38 2018] Call Trace:

      [Fri Nov 2 15:07:38 2018] [<ffffffff81687133>] dump_stack+0x19/0x1b

      [Fri Nov 2 15:07:38 2018] [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0

      [Fri Nov 2 15:07:38 2018] [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20

      [Fri Nov 2 15:07:38 2018] [<ffffffffa0aed1c8>] cma_acquire_dev+0x268/0x280 [rdma_cm]

      [Fri Nov 2 15:07:38 2018] [<ffffffffa0af214a>] rdma_bind_addr+0x85a/0x910 [rdma_cm]

      [Fri Nov 2 15:07:38 2018] [<ffffffff8120e5e6>] ? path_openat+0x166/0x490

      [Fri Nov 2 15:07:38 2018] [<ffffffff8168a982>] ? mutex_lock+0x12/0x2f

      [Fri Nov 2 15:07:38 2018] [<ffffffffa082c104>] ucma_bind+0x84/0xd0 [rdma_ucm]

      [Fri Nov 2 15:07:38 2018] [<ffffffffa082b71b>] ucma_write+0xcb/0x150 [rdma_ucm]

      [Fri Nov 2 15:07:38 2018] [<ffffffff811fe9fd>] vfs_write+0xbd/0x1e0

      [Fri Nov 2 15:07:38 2018] [<ffffffff810ad1ec>] ? task_work_run+0xac/0xe0

      [Fri Nov 2 15:07:38 2018] [<ffffffff811ff51f>] SyS_write+0x7f/0xe0

      [Fri Nov 2 15:07:38 2018] [<ffffffff81697809>] system_call_fastpath+0x16/0x1b

      [Fri Nov 2 15:07:38 2018] ---[ end trace c97345452e609a78 ]---
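      To compare a trace like the one above with one taken under the 4.1 driver, it may help to reduce the dmesg text to a list of stack frames. The following is only a rough sketch: the function name `extract_frames` and the regular expression are my own assumptions, written against the frame format shown in this log, and are not part of any Mellanox or kernel tool.

```python
import re

# Matches stack frames in the format seen in this dmesg output, e.g.
#   [<ffffffffa0aed1c8>] cma_acquire_dev+0x268/0x280 [rdma_cm]
#   [<ffffffff8120e5e6>] ? path_openat+0x166/0x490
# A leading "?" marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def extract_frames(dmesg_text):
    """Return a (symbol, module, reliable) tuple for each stack frame found.

    Frames with no module suffix are labeled "kernel" (a label chosen
    here for illustration, meaning the symbol is built into the kernel).
    """
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((
                match.group("symbol"),
                match.group("module") or "kernel",
                match.group("unreliable") is None,
            ))
    return frames
```

      Filtering the result for frames whose module is not "kernel" leaves only the `rdma_cm` and `rdma_ucm` entries, which is the part of the trace that comes from the OFED modules.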

      [Fri Nov 2 15:07:38 2018] ------------[ cut here ]------------

      [Fri Nov 2 15:07:38 2018] WARNING: at /var/tmp/OFED_topdir/BUILD/mlnx-ofa_kernel-4.4/obj/default/drivers/infiniband/core/cma.c:666 cma_acquire_dev+0x268/0x280 [rdma_cm]()

      [Fri Nov 2 15:07:38 2018] Modules linked in: fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop bonding rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) iTCO_wdt dcdbas iTCO_vendor_support mxm_wmi intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ses ipmi_devintf enclosure pcspkr ipmi_si ipmi_msghandler wmi acpi_power_meter shpchp lpc_ich mei_me sb_edac edac_core mei ip_tables xfs libcrc32c mlx4_ib(OE) mlx4_en(OE)

      [Fri Nov 2 15:07:38 2018] ib_core(OE) sd_mod crc_t10dif crct10dif_generic mgag200 crct10dif_pclmul drm_kms_helper crct10dif_common crc32c_intel syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mpt3sas raid_class drm scsi_transport_sas nvme ahci ixgbe libahci igb mdio libata ptp i2c_algo_bit mlx4_core(OE) i2c_core pps_core megaraid_sas devlink dca mlx_compat(OE) fjes dm_mirror dm_region_hash dm_log dm_mod zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate sg