10G Networking Upgrade: Intel X540 Problems
I decided it was time to upgrade the 10G networking in my lab. Here is the story of the problems I faced.
My main ESXi box has a dual-port Intel X540 10G NIC built in, but I don't have any 10GBaseT ports, so I have run it at 1G for the longest time. The only 10G networking in the ESXi box is from a single-port Mellanox ConnectX-2. A few months ago the retention clip broke on the card, meaning that if I pulled the server out on its rails, the SFP+ module would just slide out on its own.
I decided it was time to replace the card, but then I saw that Fiberstore (FS.com) sells 10GBaseT transceivers for just $60! I ordered two of them, plus a few OM3 patch cables so I could replace my old, worn-looking SFP+ DACs and eventually just toss the Mellanox card.
The upgrade went well, and it meant my ESXi box had 3 x 10G links until I took the card out. Awesome!... for now.
The setup looked great along with some Monoprice SlimRun Cat6a cables. It almost looks like singlemode fiber.
The transceivers did run a little warm, which was expected, but it was a sign of things to come...
As soon as I got ESXi booted back up, I got an email alert about a critical system issue. I logged in and saw the LAN temp in the Supermicro IPMI showing 115°C. That can't be good... Here you can see the instant jump in temp in LibreNMS:
As you can see, the temperature had always been quite high, but airflow, air temperature and load seemed to have no effect on it, so I assumed it just ran hot and that's the way it was.
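For anyone who wants a similar threshold check without a full monitoring stack, here is a rough sketch in Python. It is not the alerting setup used in this post. It assumes pipe-separated sensor rows in the style that `ipmitool sensor` prints (name, value, unit, status), and the sample rows, the 100°C limit, and the function names are all made up for illustration. Only the 115°C LAN reading comes from the story above.

```python
def parse_sensor_table(text):
    """Parse rows shaped like 'name | value | unit | status' into {name: degrees C}.

    Rows that are not temperatures, or whose value is not a number, are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3 or fields[2] != "degrees C":
            continue  # not a temperature row (fan speed, voltage, blank line)
        try:
            readings[fields[0]] = float(fields[1])
        except ValueError:
            continue  # e.g. 'na' reported for a sensor that is not populated
    return readings


def over_threshold(readings, limit_c):
    """Return only the sensors reading at or above limit_c."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}


# Hypothetical sample data; a real board will have different sensor names.
SAMPLE = """\
CPU1 Temp        | 48.000     | degrees C  | ok
LAN Temp         | 115.000    | degrees C  | cr
FAN1             | 4500.000   | RPM        | ok
P1-DIMMA1 Temp   | na         | degrees C  | na
"""

# The 100 C limit is an arbitrary illustration, not a vendor threshold.
hot = over_threshold(parse_sensor_table(SAMPLE), limit_c=100.0)
print(hot)  # {'LAN Temp': 115.0}
```

In practice the text would come from the BMC rather than a hard-coded string, and the result would feed an email or chat notification instead of `print`.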
Immediately I started figuring out ways to cool the chip. I found a picture of the Supermicro X9DRW-CTF31 board and saw that the LAN chip seemed to have no heatsink.
This seemed strange, because the add-in card version of the X540 has a HUGE heatsink, and people have reported overheating issues with the X540.
So I figured the fix would be to find a heatsink, and glue it on the chip in hopes I could use it.
I took apart a ton of old server gear and found an assortment of heatsinks, and bought some thermal adhesive online. I eventually turned the server off and opened the case, only to find there was ALREADY a heatsink on it! But, it was completely cold to the touch.
Supermicro support confirmed there was no heatsink on the board from the factory, so I am not too sure how this got on there. The server was a FireEye appliance originally, so perhaps they installed it? Either way something weird was going on.
It was attached with some very strong adhesive I couldn't remove, so I decided to just leave it alone. I did remove a serial connector and a blanking plate that were directly behind the heatsink to try to improve airflow, but that did nothing.
I ended up doing my research and found that the Broadcom 57810 is the best value-for-performance card for ESXi, so I ordered one for $30 and sent my two 10GBaseT transceivers back for a refund of $120. In hindsight, I should have just done this from the start.
I installed the card, and everything just works. At least now everything is running SFP+ with OM3 fiber, which looks nice. The other device with 10G SFP+ is my Synology DS1817+ NAS. The next planned upgrade will be my pfSense box so I can do inter-VLAN routing at 10G without using the switch's routing capabilities.
That's all!