Using mbuffer to speed up slow zfs send | zfs receive

July 18th, 2010

So, you find yourself doing a zfs send | receive, perhaps a large incremental send. The transfer is going really slowly, trickling along at less than 1MB/sec. Yet you know the sender and the receiver are capable of far more than this. What’s the deal?

Well, basically, zfs receive is bursty – it can spend ages computing something, doing no receiving, then blat the data out to disk. The issue with this is that it stalls the sender, resulting in a bursty and slow transfer process.

The solution is to add mbuffer into the mix. mbuffer will buffer the stream, and you can run it at both ends. While zfs receive can’t receive, mbuffer buffers; when zfs receive can receive, mbuffer feeds it data as fast as it can. Let’s see an example:

# Start the receiver first. This listens on port 9090, has a 1GB buffer,
# and uses 128k chunks (same as zfs):

mbuffer -s 128k -m 1G -I 9090 | zfs receive data/filesystem

# Now we send the data, also sending it through mbuffer:

zfs send -i data/filesystem@1 data/filesystem@2 | mbuffer -s 128k -m 1G -O 10.0.0.1:9090

You’ll get a lovely output such as:

in @ 15.8 MB/s, out @ 8923 kB/s, 1.0 GB total, buffer   6% full
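
If you want to keep an eye on that status line from a script, the buffer fill figure is the most useful part. Here is a minimal sketch, assuming the status line looks like the one above (other mbuffer versions may format it differently). The function name buffer_fill is made up for illustration and is not part of mbuffer.

```shell
#!/bin/sh
# Extract the "buffer N% full" figure from an mbuffer status line on stdin.
# Assumes the line format shown in this post; buffer_fill is a hypothetical helper.
buffer_fill() {
    sed -n 's/.*buffer *\([0-9][0-9]*\)% full.*/\1/p'
}

# Example using the status line from this post:
echo 'in @ 15.8 MB/s, out @ 8923 kB/s, 1.0 GB total, buffer   6% full' | buffer_fill
```

A receiver-side buffer that sits near 100% suggests zfs receive is the slow party; one that sits near 0% suggests the data is not arriving fast enough.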

And hopefully, you’ll find your zfs send|receive suddenly goes a lot quicker.

Here is the link to download the useful mbuffer program.

I personally found this approach decreased my send time for 40GB from over 4 hours down to around 30 minutes. Hurrah! :)

Entry Filed under: General

12 Comments

  • 1. Eric Sproul  |  July 21st, 2010 at 8:34 pm

    This looks interesting, but you don’t mention what transport you were using before mbuffer. Assuming ssh, how much of the speedup can be attributed to mbuffer versus not having to do the ssh crypto?

    Folks might also want to remember that the mbuffer traffic is going over the network in the clear, which might be a concern in some environments.

  • 2. Alasdair  |  July 28th, 2010 at 7:26 am

    Hi Eric,

    Obviously people can still use mbuffer in conjunction with SSH so you continue to get an encrypted transport stream and you’ll benefit from the buffer as well.

    This is actually the case for what I was using mbuffer for; I just dropped mbuffer into the backup script on the receiving side, and SSH is still the transport stream. This gave a large speed up.

    SSH Encryption gives you a fairly constant transfer speed. The issue we had specifically was that ZFS Receive would hang about not receiving any data whilst performing computations, then do an enormous burst of activity. Adding mbuffer helps tremendously for this case as it buffered data during the stall, then could feed ZFS receive as fast as it could go during the burst of activity.

    Alasdair

  • 3. Andrew Gabriel  |  August 2nd, 2010 at 10:34 pm

    See the bug I raised on this: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6729347

    The receiver does a giant read per TXG. Between these reads, the standard TCP network buffers fill and flow control stops the sender. When the disks and the network have similar bandwidth, you will see network activity and disk activity alternate. With a buffer on the receive side that can hold 5 seconds’ worth of data at the rate the disk and network can stream, you will instead see the data flow continuously. I didn’t see any significant benefit from a buffer at the sender, but a small one can’t do any harm.
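
The sizing rule in this comment (hold about 5 seconds of data at the streaming rate) is simple arithmetic. Here is a rough sketch of it; the function name buffer_size_mb is hypothetical, and the rate is something you would have to measure yourself.

```shell
#!/bin/sh
# Suggest an mbuffer -m size, in whole megabytes, that holds a given number
# of seconds of data at a given streaming rate. Inputs are whole numbers.
# buffer_size_mb is a hypothetical helper, not part of mbuffer or zfs.
buffer_size_mb() {
    rate_mb_per_sec=$1      # measured streaming rate in MB/s
    seconds=${2:-5}         # seconds of data to hold; 5 as suggested in the comment
    echo "$((rate_mb_per_sec * seconds))M"
}

# Example: a link and disks that stream roughly 110 MB/s
buffer_size_mb 110      # prints 550M
```

The result can be passed to mbuffer as the -m value on the receiving side; rounding up (for example to 1G, as in the post) costs only memory.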

  • 4. sh0x  |  August 30th, 2010 at 3:23 am

    Sweet! Thanks for the tip. My zfs send/recv went 45Mb/s using SSH for transport. I know there is network and cpu overhead with SSH, but the same transfer using mbuffer went 112MB/s (w/out SSH).

    For my home whitebox setup I can’t ask for more throughput than that!

    Ok I can, time to bundle NICs. :)

  • 5. Anthony  |  October 12th, 2010 at 7:49 pm

    I’m glad I stumbled across this post.

    I have an X4600 M2 that’s being replaced by an X4270 M2 with new disks – I have a dedicated interface on both for data migration. Lots of cycles and physmem on both sides. zfs send/receive over ssh defaults got me on the order of 24 MB/s. Using the blowfish cipher maxed at ~52 MB/s, and the arcfour cipher at ~67 MB/s. I’ve tested today using mbuffer as above, and consistently get 109-116 MB/s throughput with the sending buffer pegged 100% full and the receiving buffer mostly empty, occasionally rising up to 25% or so briefly before falling back to 0. I interpret this as indicating that the GigE interfaces are the bottleneck, and suspect that with 10GE interfaces I could easily double and likely triple throughput. I may try adding compression into the pipes to see how that affects time to completion.
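
The reasoning in this comment (sending buffer full, receiving buffer empty, so the link is the limit) can be written down as a small rule of thumb. This is only a sketch: the function name and the 90%/10% thresholds are arbitrary assumptions, and the fill percentages have to be read off mbuffer's status output by hand or by your own script.

```shell
#!/bin/sh
# Rough reading of mbuffer fill levels (in percent) at each end of a transfer.
# guess_bottleneck and its 90/10 thresholds are assumptions for illustration.
guess_bottleneck() {
    send_fill=$1    # sender-side buffer fill, percent
    recv_fill=$2    # receiver-side buffer fill, percent
    if [ "$recv_fill" -ge 90 ]; then
        # Receiver buffer is not draining: zfs receive or its disks are slow.
        echo "receiver"
    elif [ "$send_fill" -ge 90 ]; then
        # Sender buffer is full but the receiver is starved: whatever sits
        # between the two buffers (network, ssh) is the limit.
        echo "link"
    elif [ "$send_fill" -le 10 ]; then
        # Sender buffer never fills: zfs send or its disks are slow.
        echo "sender"
    else
        echo "unclear"
    fi
}

# The situation described in the comment: sender pegged, receiver mostly empty
guess_bottleneck 100 0   # prints link
```

The receiver is checked first because a full receive buffer eventually backs up the sender's buffer too.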

  • 6. Shawn  |  December 27th, 2010 at 5:10 am

    So mbuffer is working great for network transfers, however… how can it be used to speed up local zfs sends and receives? I have not had any luck speeding up a local sync between two local pools. When I try it the buffers just fill and then I am back to trickle mode again. Any thoughts?

  • 7. Alasdair  |  January 23rd, 2011 at 12:28 am

    Shawn – MBuffer probably won’t help in your case. If you’re getting a really slow zfs send/receive speed locally, it’ll be due to an issue that you’ll need to track down.

    If you’re using dedupe, that can be really slow. You might have a dodgy disk/controller and/or drivers, or you could be on an old version of ZFS.

    Whenever I have seen really slow zfs send/receive speeds in the past, it has been due to IO errors, either from dying disks or from a disk/controller Solaris didn’t like. “iostat -En” error counts can be helpful here, and “iostat -x” can help identify the slow party.
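
To pick the error counts out of “iostat -En” quickly, a small filter helps. This is a sketch that assumes the Solaris-style summary line for each device (device name, then Soft Errors, Hard Errors and Transport Errors counts on one line); the function name disks_with_errors is made up for illustration.

```shell
#!/bin/sh
# List devices whose soft, hard or transport error count is non-zero.
# Reads "iostat -En" output on stdin and assumes summary lines of the form:
#   c1t0d0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
# disks_with_errors is a hypothetical helper name.
disks_with_errors() {
    awk '/Soft Errors:/ && ($4 + $7 + $10 > 0) { print $1 }'
}

# Real use, on the affected host:
#   iostat -En | disks_with_errors
```

Any device it prints is worth a closer look before blaming zfs send/receive itself.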

  • 8. John Laur  |  January 29th, 2011 at 4:34 am

    If you are using mbuffer on a local replication, it’s tedious, complicated and unnecessary to run two copies on the same machine. Try this syntax:

    zfs send … | mbuffer -s 128k -m 2G -o - | zfs receive …

    the “-o -” puts mbuffer into file output mode, but then sends to stdout instead. Fancy!

  • 9. Jim Sloey  |  November 11th, 2011 at 5:31 pm

    Is anyone using this in an automated script?

  • 10. John Ryan  |  March 12th, 2012 at 3:13 pm

    I’ve automated this in a script like:
    zfs send pool/tank@now | mbuffer -s 128k -m 1G 2>/dev/null | ssh -c arcfour128 remote.machine "mbuffer -q -s 128k -m 1G 2>/dev/null | zfs recv -F -v pool/tank"

    also works with zfs send -i
    tests show me arcfour128 speeds it up as well.
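
For the incremental case, here is a sketch of how that one-liner might be wrapped in a script. It only prints the pipeline so it can be reviewed before anything is run. The function name, snapshot names and argument order are all hypothetical; the cipher defaults to the arcfour128 mentioned in the comment, which newer OpenSSH releases no longer include, so it is left as a parameter.

```shell
#!/bin/sh
# Print (do not run) an incremental send pipeline modelled on the one-liner
# in the comment above. build_sync_cmd and all argument values are
# hypothetical; review the printed command before running it.
build_sync_cmd() {
    dataset=$1      # e.g. pool/tank
    from_snap=$2    # snapshot the receiver already has
    to_snap=$3      # snapshot to bring the receiver up to
    remote=$4       # ssh destination
    cipher=${5:-arcfour128}   # from the comment; may be missing in newer OpenSSH
    printf '%s | %s | %s\n' \
        "zfs send -i ${dataset}@${from_snap} ${dataset}@${to_snap}" \
        "mbuffer -q -s 128k -m 1G" \
        "ssh -c ${cipher} ${remote} 'mbuffer -q -s 128k -m 1G | zfs recv -F -v ${dataset}'"
}

# Example with made-up snapshot names:
build_sync_cmd pool/tank monday tuesday remote.machine
```

Once the printed command looks right, it can be pasted into a shell or a cron job.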

  • 11. Mozaik. » Blog Arch…  |  June 13th, 2012 at 6:03 pm

    [...] syncing the changes to USB took long (days), this took weeks. That’s when I learned about mbuffer to speed up zfs sync, but dedup and compression still took their toll. Plain uncompressed and [...]

  • 12. zfs send receive performa…  |  October 29th, 2012 at 10:46 pm

    [...] [...]
