Why am I seeing such low SMB transfer throughput?
Ok, there's a bit more to the story than the title implies.
Background and Environment: I'm copying several TB from an older Ubuntu
server to a newer Windows 2012 server over SMB. (Technically, it's
commodity hardware, but they're servers around here.) Everybody is on a
gigabit LAN, and the older Ubuntu box has a bonded interface. I believe
the Ubuntu server has two Rosewill PCI-e 1x ethernet cards and the Windows
server has one reasonably nice PCI Intel ethernet card.
The destination computer (the Windows server) is running a Storage Pool
with parity over 4x 2TB drives. It is running Microsoft's new ReFS. The
source computer (the Ubuntu server) is running a software RAID mirror. It
is running good ol' EXT4.
The two servers are running through a single gigabit switch. I have
experimented with breaking the bonding on the source (Ubuntu) computer
without any improvement.
Problem: I have no trouble transferring at reasonable speeds from other
computers to the Windows server. Other computers can hold 50-80MB/s
without much difficulty, but transferring from that Ubuntu server tops out
at no more than 20MB/s. 4+TB at 20MB/s takes a long time (something like
2.3 days), and I'm wondering what I can do to figure out where the
bottleneck is.
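For reference, here is the arithmetic behind that estimate as a small sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and treats 125 MB/s as the raw gigabit line rate with protocol overhead ignored. The function names are just for illustration.

```python
# Back-of-the-envelope numbers only. Assumes decimal units:
# 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.
GIGABIT_RAW_MB_S = 1_000_000_000 / 8 / 1_000_000  # 125 MB/s, overhead ignored


def transfer_days(size_tb: float, rate_mb_s: float) -> float:
    """Days needed to move size_tb terabytes at a sustained rate_mb_s."""
    seconds = size_tb * 1e12 / (rate_mb_s * 1e6)
    return seconds / 86_400


def link_utilization(rate_mb_s: float) -> float:
    """Fraction of the raw gigabit line rate that rate_mb_s represents."""
    return rate_mb_s / GIGABIT_RAW_MB_S


if __name__ == "__main__":
    # 20 MB/s is what the Ubuntu box manages; 50 and 80 are what other hosts hold.
    for rate in (20, 50, 80):
        print(f"{rate} MB/s: {transfer_days(4, rate):.2f} days, "
              f"{link_utilization(rate):.0%} of raw gigabit")
```

At 20MB/s the link is only a small fraction utilized, which is why I doubt the wire itself is the limit.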
Symptoms: CPU usage on both computers is minimal, and certainly not
high enough to be a bottleneck. Hard drives on both computers are active
but not swamped, and CPU IOwait is almost 0% on at least the Ubuntu server.
I did a Wireshark trace for 35 seconds (presumably long enough to make
sure all ACKs were for new packets) and noticed that there were quite a
few things I didn't expect.
(1) There weren't any checksums for the ACKs (and SOME SMB packets) from
Windows to Ubuntu. However, Wireshark claims that this may be due to "IP
checksum offload." Ok, I have a pretty nice card in there. I suppose it is
possible that the network card could do checksum calculations. Fine.
Moving on...
(2) "TCP ACKed unseen segment." This one I have a problem with. The ACK
number is within an acceptable range from what I can tell, and there are
often huge blocks of these messages. Perhaps Wireshark is just too slow?
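To make the second observation concrete, here is a simplified model of what I understand the "ACKed unseen segment" flag to mean: an ACK number that is higher than any data the capture has recorded from the opposite direction. This is a sketch of the idea only, not Wireshark's actual implementation. It assumes a single connection, relative sequence numbers, and no 32-bit wraparound, and the `Segment` type and host labels are hypothetical.

```python
# Simplified model of the "ACKed unseen segment" idea; NOT Wireshark's code.
# Assumptions: one TCP connection, relative sequence numbers, no wraparound.
from dataclasses import dataclass


@dataclass
class Segment:
    sender: str       # "ubuntu" or "windows" (hypothetical labels)
    seq: int          # sequence number of the first payload byte
    payload_len: int  # number of payload bytes carried
    ack: int          # acknowledgment number


def find_acked_unseen(segments):
    """Return indices of segments whose ACK is beyond any data the
    capture has recorded from the opposite direction."""
    next_seq = {}  # sender -> highest (seq + payload_len) captured so far
    flagged = []
    for i, seg in enumerate(segments):
        peer = "windows" if seg.sender == "ubuntu" else "ubuntu"
        if peer in next_seq and seg.ack > next_seq[peer]:
            flagged.append(i)
        end = seg.seq + seg.payload_len
        next_seq[seg.sender] = max(next_seq.get(seg.sender, 0), end)
    return flagged


if __name__ == "__main__":
    capture = [
        Segment("ubuntu", 1, 1460, 1),   # data the capture did record
        Segment("windows", 1, 0, 1461),  # ACK for that data: fine
        Segment("windows", 1, 0, 4381),  # ACK for data the capture never saw
    ]
    print(find_acked_unseen(capture))
```

If that model is right, a large block of these messages would mean the capture is missing packets that were really on the wire, which fits the "Wireshark is too slow" theory better than Windows acknowledging data Ubuntu never sent.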
Summary: Transfer speed sucks (20MB/s over gigabit ethernet) and I don't
know why. Wireshark claims Windows is ACKing things that were never sent
by Ubuntu.
Guesses: My initial guess is that the cheaper Rosewill cards are getting
swamped. My second guess is that the software RAID layer on one end or
the other is getting inundated with work.
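One way I could check the second guess on the source side is to time a plain sequential read of a large file straight off the mirror, with no network involved. The sketch below is a rough measurement rather than a proper benchmark. The function name is made up, and the file should be large and not recently read, since otherwise the page cache will inflate the number.

```python
# Rough sequential-read timing; a sketch, not a proper benchmark.
# Caveat: a file that was read recently may be served from the page cache.
import time


def read_throughput_mb_s(path: str, chunk_size: int = 1024 * 1024) -> float:
    """Read the file at path front to back and return the rate in MB/s
    (1 MB = 10**6 bytes)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / 1e6 / elapsed if elapsed > 0 else float("inf")
```

If this reports well above 20MB/s on the Ubuntu box, the mirror is probably not the limit on the read side, and the network cards or the SMB layer become the more likely suspects.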