Sunday, March 7, 2010

Hyper-V R2 Upgrade Problems When Using Broadcom NetXtreme II 5708 NICs Bound to Virtual Switches

I have been very busy over the last couple of months upgrading Hyper-V hosts to R2. For the most part, these upgrades have been uneventful, which is always a good thing. However, I said "for the most part", so let's talk about what happened.

Early in the upgrades, I saw some seemingly random cases of Virtual Switches losing their bindings to the physical NIC. This caught my attention, but after poking around and finding no evident cause, I just recreated the virtual switch and moved on. However, as I took on more and more large-scale upgrades, I started to see a concerning trend: any Virtual Switch that was bound to a Broadcom NetXtreme II 5708 was losing its bindings. I decided to see if I could reproduce this in a lab and found that I could do so easily.

I am sure some of you are wondering why I would even be using Broadcom NICs for my virtual networks instead of the good old dependable Intel that we all know and love. The answer is simple. I run all Dell in my data center, and all Dell PowerEdge servers come with Broadcom NICs on-board. I am running some pretty dense Hyper-V configurations, with one cluster running over 12 virtual networks, so I need all the NICs I can get. Not using the four on-board Broadcoms in this configuration wasn't an option. Well, at least it wasn't before, but that has changed.

I decided to reach out to my friends over at the Microsoft product group to see if they had heard of this problem internally or in feedback from other customers. Apparently, this is a recently discovered issue that is now officially documented as a known issue. Currently, Microsoft doesn't have a workaround for it, nor is there an ETA for a resolution. After all, the fault may not lie with Microsoft; it could very well be Broadcom.

So, what do we do? Well, in my configurations I can't afford these little gotchas, so I will be working only with my trusted Intel NICs for my virtual networks. And my Broadcom NICs? Well, they can still be used, but in my opinion they are suited only to management and iSCSI connections.

7 comments:

  1. Roger, that's disconcerting. I've got an R710 with the 5709 in it I run. I'm hoping I don't see these issues.
  2. I have an R710 in my lab testing the 5709 upgrade experience now. I will let you know what I find.

    If you have any 5709 experience, please share.
  3. I'm seeing a similar issue with R710s with 5709s.

    The virtual machines will randomly lose network connectivity, the only way to bring a client back online is to reboot it.

    We have this on all our R710s, and because we use Fibre and SAS to connect to SANs and DAS, we don't have any choice but to use the built-in Broadcoms.
  4. I am seeing this on M610s as well. I have to reboot the VM to reconnect.
  5. I have 3 x R710s running in a cluster with full Windows Enterprise 2008 R2 and BCM5709C NetXtreme II (x4 onboard) without any issues. In fact, I am the opposite: whenever I've come across older Dell 1800/1850 machines with Intel cards in them, I've had nothing but disconnect problems with them (usually with newer switches).
  6. We have several HP DL180s with 2008 R2 and Broadcom NetXtreme, and we are suffering the same problem.

    Communications hang and a reboot is needed.
  7. Have you found a fix for this issue? I am getting a Dell R910 with 5709s and wondering if I will end up having the same issue.