I did the tests mentioned in the fourth bullet. With IB cards and cables that I've confirmed work at QDR, connected through the IS5031, the cards negotiated to 20 Gb/s, so it must be something with the switch. I also tried the QLE7340 with the Sun switch (which I've confirmed works at 40 Gb/s). Interestingly, the QLE7340 negotiated to 10 Gb/s. So something is screwy with both the switch and the QLE cards. Maybe something to do with firmware mismatching?
I tried two QLE7340s with the Sun switch, and they both negotiated to 10 Gb/s. They negotiate to 20 Gb/s with the IS5031. Summary of tests:
- Two QLE7340, IS5031 switch, opensm : both negotiate to 20 Gb/s
- HP rebranded mellanox HCAs, Sun switch with internal SM : 40 Gb/s
- HP rebranded mellanox HCAs, IS5031 switch, opensm : 20 Gb/s
- QLE7340 and HP rebranded mellanox HCA, Sun switch with internal SM : 40 Gb/s for HP card, 10 Gb/s for QLE7340
- Two QLE7340, Sun switch with internal SM : both 10 Gb/s
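To make the numbers above easier to read, here is a small sketch that maps each negotiated rate to its InfiniBand generation. It assumes every link is 4x wide (not stated explicitly in this thread), in which case SDR, DDR and QDR come out at 10, 20 and 40 Gb/s. The function and table names are just illustrative.

```python
# Nominal rates for a 4x-wide InfiniBand link, in Gb/s.
# Assumption: every link in the tests is 4x wide.
RATE_TO_GENERATION = {10: "SDR", 20: "DDR", 40: "QDR"}


def classify_4x_rate(rate_gbps):
    """Return the InfiniBand generation matching a negotiated 4x rate."""
    return RATE_TO_GENERATION.get(rate_gbps, "unknown")


# Test results copied from the summary above: (HCAs, switch, rates in Gb/s).
RESULTS = [
    ("2x QLE7340", "IS5031 + opensm", [20, 20]),
    ("HP/Mellanox HCAs", "Sun switch, internal SM", [40]),
    ("HP/Mellanox HCAs", "IS5031 + opensm", [20]),
    ("HP/Mellanox HCA + QLE7340", "Sun switch, internal SM", [40, 10]),
    ("2x QLE7340", "Sun switch, internal SM", [10, 10]),
]

for hcas, switch, rates in RESULTS:
    generations = ", ".join(classify_4x_rate(r) for r in rates)
    print(f"{hcas:28s} {switch:26s} -> {generations}")
```

Read this way, nothing attached to the IS5031 gets past DDR, and the QLE7340s fall back to SDR on the Sun switch.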
I did a lot of googling. I've learned a little about PSM vs. verbs. Qlogic/Intel Infinipath uses PSM, and Mellanox/everyone else uses verbs. I know that when using MPI, you have to use one or the other, and that using Qlogic/Intel HCAs with verbs is slower. I've read some posts that say that you can't mix qlogic and other brands, and some posts that say you can. But I don't really understand it. Anyone have any insight on this?
I also learned that the Qlogic HCAs don't have firmware and that you're supposed to use the True scale fabric suite (OFED+ is the free version I guess?) from Intel. I'll give that a try next.
Still haven't found anything about the two licenses or the firmware for the IS5031. I sent a support inquiry...I don't have a warranty or anything, but we'll see.
Progress! I contacted Mellanox support through their support website. Very fast response.
If your switch was sold after 2012, it should have the FabricIT/Subnet Manager license on the underside of the pull out tab. If it was not, you need to contact technical support with your switch's serial number and they can check for it internally. UPDATE: you'll likely have to contact sales instead...there's only a small chance that technical support has a license registered for your switch if it's not on the underside of the pull out tab. UPDATE UPDATE: Do not contact sales. See later reply.
They said that when you upgrade FabricIT, which runs the subnet manager, the firmware also gets upgraded automatically. The last FabricIT version is 1.1.3004 (you can see your switch's version using the "show version" command), and the hardware is end of life, so if you already have that version you don't need to do any upgrades. Mine is 1.1.2500, so I need to upgrade it.
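If you have several switches to check, the comparison can be sketched like this. It assumes you copy the version number by hand from "show version" (the exact output format isn't shown in this thread, so nothing here tries to parse it), and the function names are made up for illustration.

```python
LAST_FABRICIT_VERSION = "1.1.3004"  # final release, per Mellanox support


def parse_version(text):
    """Turn a dotted version string such as '1.1.2500' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed, latest=LAST_FABRICIT_VERSION):
    """True if the installed version is older than the latest release."""
    return parse_version(installed) < parse_version(latest)


print(needs_upgrade("1.1.2500"))  # True: my switch, needs the upgrade
print(needs_upgrade("1.1.3004"))  # False: already on the last release
```

Comparing tuples of integers rather than raw strings avoids surprises if a version component ever has a different number of digits.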
Process for upgrading FabricIT software and firmware of IS5030 or IS5031 IB switches:
- Download the latest firmware image. This is the link they gave me, but it might be temporary. Here it is on my google drive.
- Starting from Section 5.1, follow the instructions in “FabricIT Enterprise Fabric Management Software User Manual For EFM Rev 1.1.3004”. (This is the link they gave me, link on my google drive.) This assumes you can already ssh into the switch. If not, read the Installation Guide (available online) or earlier sections of that User Manual for how to set up the ethernet port.
- I had to modify the given steps slightly. Note that I configured my switch with a static IP of 192.168.0.3. Here are my modifications:
- Since I didn’t set up a user or password, the ssh login was “ssh -l admin 192.168.0.3”.
- I didn’t have any images available to be installed, so I skipped the deletion step.
- If you are using a modern Linux system to ssh into the switch in order to scp from the switch, you will receive an error “no kex alg”. This is because some of the key exchange (kex) algorithms the switch uses have been found to have security vulnerabilities since the switch was made, so newer OpenSSH versions disable them by default. You need to exit out of the switch ssh session, then append the following line to /etc/ssh/sshd_config on your Linux machine: “KexAlgorithms diffie-hellman-group1-sha1,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1” (without quotes). Then restart sshd and log back into the switch. You should remove that line when you’re done with scp and restart sshd again.
- My scp line looked like “image fetch scp://<Linux username>@192.168.0.1/home/<Linux username>/Downloads/image-PPC_M405EX-EFM_1.1.3004.img”, where the IP address is the IP address of your computer's ethernet adapter. It then prompted me for a password and I input my Linux account’s password. This worked: "show images" showed an image ready to be installed.
- I wasn’t prompted to save configuration.
- After the reload, I had to wait about 5 minutes before I could ssh back into it. After running “show version”, the new firmware was present.
- All other steps were the same as in the User Manual.
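The two fiddly parts of my modifications (temporarily adding the KexAlgorithms line, and building the "image fetch" command) can be sketched as below. This only works on strings: actually writing /etc/ssh/sshd_config and restarting sshd need root and are left out on purpose. The function names, the stand-in config text and the username "alice" are hypothetical. One caveat, as far as I know: sshd uses the first value it finds for each keyword, so if your sshd_config already has a KexAlgorithms line, an appended one may be ignored.

```python
# The KexAlgorithms line from the step above, copied verbatim.
LEGACY_KEX_LINE = (
    "KexAlgorithms diffie-hellman-group1-sha1,ecdh-sha2-nistp256,"
    "ecdh-sha2-nistp384,ecdh-sha2-nistp521,"
    "diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1"
)


def add_legacy_kex(config_text):
    """Append the legacy kex line to sshd_config text if it is not there yet."""
    lines = config_text.splitlines()
    if LEGACY_KEX_LINE not in lines:
        lines.append(LEGACY_KEX_LINE)
    return "\n".join(lines) + "\n"


def remove_legacy_kex(config_text):
    """Drop the legacy kex line again once the image has been fetched."""
    lines = [line for line in config_text.splitlines() if line != LEGACY_KEX_LINE]
    return "\n".join(lines) + "\n"


def image_fetch_command(user, host_ip, image_path):
    """Build the switch-side command that pulls the image over scp."""
    return f"image fetch scp://{user}@{host_ip}{image_path}"


if __name__ == "__main__":
    original = "Port 22\nPasswordAuthentication yes\n"  # stand-in, not a real config
    patched = add_legacy_kex(original)
    # Removing the line again should give back exactly what we started with.
    assert remove_legacy_kex(patched) == original
    print(image_fetch_command(
        "alice",  # hypothetical Linux username
        "192.168.0.1",  # IP of the computer's ethernet adapter, as in my setup
        "/home/alice/Downloads/image-PPC_M405EX-EFM_1.1.3004.img",
    ))
```

Keeping the add and remove steps as a matched pair makes it harder to forget to take the weak algorithms back out afterwards.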
As for the 36 port upgrade license, they directed me to email@example.com. I sent them an email, so we'll see... I'm also still waiting on the subnet manager license. The HCAs that were confirmed to work at QDR with the Sun switch are still only operating at 20 Gb/s with the IS5031 switch, with opensm running on one of the nodes. UPDATE: Don't contact sales. See later reply.
Did you manage to get a subnet manager license? I have contacted support and sales and both said that they cannot provide me a license for the subnet manager as the switch was not purchased through a certified partner.
Not yet. They said they would provide me with one, but they haven't yet.
I'll keep bugging them and update the post as things develop.
I'd love to get the 18 to 36 port license upgrade for cheap...I'll try to talk them into it since the switches are discontinued.
Ok, summary of progress:
1. Firmware update successful, details of the process are listed above. Mellanox technical support is awesome.
2. Technical support cannot give me a subnet manager license, and it was not written on the underside of my pull out tab. They directed me to sales. So if you need a license, contact sales.
3. I'm in contact with sales about both the subnet manager license and 18->36 port upgrades. They said that since the switches are EOL, they don't have these licenses listed for sale. They said they are working on possibly releasing them for free or a reduced price, because the original price was "much more" than what you can buy these switches for now (eBay, ~$200 with all licenses).
Will update again when I've heard back from them.
Not much progress on the speed issues unfortunately. I'm guessing the speed problems with the IS5031 have to do with the lack of an internal subnet manager license. I've been working with Intel technical support to see whether the QLE7340s are having problems with both switches.
Ok, last reply on this topic. It's not 100% solved, but it's better.
See above for how to upgrade the firmware for the IS5031/IS5030. That's easy and solved.
The speed issues are a separate problem: I will create a second post about them, so ignore all of that here. Basically, it turns out that some of these IS50XX are actually DDR switches...be careful when buying.
The license problems are trickier. If you have the license(s) written on the bottom of the pull out tab, you're golden. If you don't, technical support can't help you. I tried contacting sales, but Mellanox's official policy (despite my begging) is to not support EoL hardware. Apparently generating the licenses for both the 36 port enable and FabricIT (which allows you to use the subnet manager) from the serial number is a pain. They won't sell them to you, and have no plans to make them available. If you do manage to get one somehow, the FabricIT user guide shows how to install it, and that's very easy to do. Thus, I STRONGLY SUGGEST YOU ONLY BUY THE SWITCHES THAT HAVE THE LICENSES INSTALLED ALREADY. Ask the seller, and make very sure that the switch comes with both the 36 port enable license and the FabricIT (internal subnet manager) license. Offer to send them the console cable and instructions on how to check, whatever it takes. And when you get your switch, do NOT perform a hardware reset. First, go into console-> config mode, type "show licenses" and write down the license keys (I think they're displayed there in full...you should be able to find them somewhere in there). They are long strings of letters/numbers and dashes. Then you can perform a hardware reset if you need to.
Hope all of this helps someone in the future...