0 Replies Latest reply on Aug 4, 2016 6:29 AM by ramakris

    mlx4 interrupt crash observed in ARM

    ramakris

      Hi,

      I have connected an X86 machine with an MLX4 adapter to an ARM board with an MLX4 adapter. The two are connected point to point.

      I sent one command from the X86 machine to the ARM board. After sending the command, the X86 side waits for a response, but it crashes with the following trace log. Does anyone have a clue about this? The same test worked for us between two X86 machines without any issue.

      [ 1037.952540] ------------[ cut here ]------------

      [ 1037.952544] WARNING: CPU: 3 PID: 0 at kernel/irq/handle.c:147 handle_irq_event_percpu+0x18c/0x1a0()

      [ 1037.952547] irq 39 handler mlx4_msi_x_interrupt+0x0/0x20 enabled interrupts

      [ 1037.952548] Modules linked in: nvmeof_host(OE+) mlx4_ib ipt_MASQUERADE xt_CHECKSUM ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables sg ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iTCO_wdt iTCO_vendor_support dcdbas x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_hdmi crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_realtek microcode snd_hda_codec_generic

      [ 1037.952567]  usb_storage pcspkr serio_raw snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd sb_edac edac_core soundcore lpc_ich i2c_i801 mfd_core shpchp acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd sunrpc uinput autofs4 xfs libcrc32c exportfs sd_mod sr_mod cdrom nouveau video mxm_wmi drm_kms_helper ttm drm igb mpt2sas e1000e dca i2c_algo_bit xhci_hcd ahci i2c_core raid_class libahci scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod ipv6

      [ 1037.952584] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE  3.17.0perry #1

      [ 1037.952585] Hardware name: Dell Inc. Precision T7610/0NK70N, BIOS A08 07/24/2014

      [ 1037.952586]  0000000000000009 ffff88206fc63e18 ffffffff815eddb7 ffff88206fc63e60

      [ 1037.952587]  ffff88206fc63e50 ffffffff8104e6cd ffff882061c050c0 0000000000000027

      [ 1037.952588]  0000000000000001 0000000000000000 0000000000000000 ffff88206fc63eb0

      [ 1037.952590] Call Trace:

      [ 1037.952590]  <IRQ>  [<ffffffff815eddb7>] dump_stack+0x45/0x56

      [ 1037.952596]  [<ffffffff8104e6cd>] warn_slowpath_common+0x7d/0xa0

      [ 1037.952597]  [<ffffffff8104e73c>] warn_slowpath_fmt+0x4c/0x50

      [ 1037.952611]  [<ffffffff81427a10>] ? mlx4_interrupt+0x90/0x90

      [ 1037.952613]  [<ffffffff8109d24c>] handle_irq_event_percpu+0x18c/0x1a0

      [ 1037.952614]  [<ffffffff8109d297>] handle_irq_event+0x37/0x60

      [ 1037.952615]  [<ffffffff8109fe4f>] handle_edge_irq+0x6f/0x120

      [ 1037.952618]  [<ffffffff81004c9f>] handle_irq+0xbf/0x150

      [ 1037.952620]  [<ffffffff8106adda>] ? atomic_notifier_call_chain+0x1a/0x20

      [ 1037.952622]  [<ffffffff815f6e6f>] do_IRQ+0x4f/0xf0

      [ 1037.952623]  [<ffffffff815f52aa>] common_interrupt+0x6a/0x6a

      [ 1037.952623]  <EOI>  [<ffffffff814dc88c>] ? cpuidle_enter_state+0x7c/0x170

      [ 1037.952627]  [<ffffffff814dca37>] cpuidle_enter+0x17/0x20

      [ 1037.952629]  [<ffffffff8108834d>] cpu_startup_entry+0x29d/0x340

      [ 1037.952631]  [<ffffffff810b77c8>] ? clockevents_config_and_register+0x28/0x30

      [ 1037.952633]  [<ffffffff81030d53>] start_secondary+0x1b3/0x260

      [ 1037.952634] ---[ end trace 18611a6ee07902a7 ]---