Bufferbloat: A Network Best Practice You’re Probably Doing Wrong

[Illustration: bufferbloat depicted as a fat, overflowing beer mug]

You may have heard of bufferbloat, but even many good network admins haven’t. If you haven’t changed settings to avoid it, then you aren’t doing it right. But what is it, why should you care, and what should you do about it? If you see your network access stutter occasionally for no apparent reason, then read on – it may be that bufferbloat is at fault. The good news is that if you’re running on a recent Linux system, it’s easy to set up correctly.

What is Bufferbloat?

Bufferbloat can occur when you mix high-bandwidth traffic like backups, Dropbox syncs, or video streaming with small packets that you want answered quickly. This can happen on a single host running several different types of traffic (from multiple virtual machines, or a mixture of different kinds of web site traffic) – or on several hosts sharing a common network connection.

What happens is that the sum of all the high-bandwidth traffic (“elephants”) fills up the network and creates a bottleneck that starves out the quick-turnaround traffic (“mice” and “ants”). The result is that you can see multi-second delays when things get congested. There are lots of different explanations of how this happens and how the solutions work, but the main thing is that you need to fix it on your hosts, your virtualization layers, and your network gear – and, perhaps most importantly, get your upstream network providers to fix it as well. For the purposes of this article, we’ll concentrate on how to fix it on your Linux systems – real or virtual. For good measure, the same fix likely applies to any Linux-based network gear you have.

Bufferbloat on the IT Best Practices Project

Let’s see what the IT Best Practices project – which describes bufferbloat and tells how to fix it – says about the problem. We’ve written about the IT Best Practices project before: it’s a cool open source project dedicated to collecting freely-available best practices. Although it concentrates on security best practices, it includes this networking best practice as well – which is also implemented by the Assimilation Suite. To explain it in more detail and see how to fix it, let’s refer to what the project says to do. It calls this practice itbp-00001. Here’s what it says:

Short description (bufferbloat)

The default network queuing discipline should avoid buffer bloat.

Long description (bufferbloat)

The default network queuing discipline should avoid bufferbloat – which destroys latency. net.core.default_qdisc sets the default queuing mechanism for Linux networking, and it has very significant effects on network performance and latency. sch_fq_codel is the current best queuing discipline for performance and latency on Linux machines, and is the current best practice for controlling bufferbloat. As of October 2014, the second-best discipline is sch_fq. For details, see this Linux Weekly News article covering an entertaining 2014 Linux Plumbers Conference presentation by Stephen Hemminger[1] that clearly explains bufferbloat.

Check (bufferbloat)

To check that the correct queue discipline is enabled to avoid bufferbloat, execute the following command:

# cat /proc/sys/net/core/default_qdisc

The preferred result is ‘fq_codel’; ‘fq’ is also acceptable. (The kernel reports the qdisc name without the sch_ module prefix, so you will not see ‘sch_fq_codel’ here.) Any other value invites bufferbloat.
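As a sketch, the whole check can be wrapped in a small shell function. The pass/fail messages below are our own invention, not part of the official practice, and the function accepts both the prefixed (sch_) and unprefixed spellings:

```shell
#!/bin/sh
# Sketch of the itbp-00001 check as a shell function.
# The message strings are illustrative, not part of the practice itself.
check_default_qdisc() {
    case "$1" in
        fq_codel|sch_fq_codel) echo "pass: preferred ($1)" ;;
        fq|sch_fq)             echo "pass: acceptable ($1)" ;;
        *)                     echo "fail: $1 invites bufferbloat" ;;
    esac
}

# On a live system, feed it the kernel's actual value:
# check_default_qdisc "$(cat /proc/sys/net/core/default_qdisc)"
check_default_qdisc fq_codel      # prints "pass: preferred (fq_codel)"
check_default_qdisc pfifo_fast    # prints "fail: pfifo_fast invites bufferbloat"
```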

Fix (bufferbloat)

To fix this permanently, add the following line to /etc/sysctl.conf – note the ‘=’, which sysctl.conf syntax requires. It will take effect when the machine reboots:

net.core.default_qdisc = fq_codel

An immediate temporary fix can be accomplished by executing this command:

# echo fq_codel > /proc/sys/net/core/default_qdisc
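An alternative that covers both steps at once uses sysctl(8). The sysctl.d file name below is our own choice – any name under /etc/sysctl.d works – and note that the value is the qdisc name without the sch_ module prefix:

```shell
# Apply immediately, without a reboot:
sysctl -w net.core.default_qdisc=fq_codel

# Persist across reboots -- note the "=", which sysctl syntax requires:
echo 'net.core.default_qdisc = fq_codel' > /etc/sysctl.d/90-bufferbloat.conf

# Reload all sysctl files to confirm the new file parses cleanly:
sysctl --system
```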

Bufferbloat in the Assimilation System Management Suite

The Assimilation System Management Suite implements this best practice using the following /proc/sys rule:

"itbp-00001": {
   "category": "networking",
   "rule": "IN($net.core.default_qdisc, sch_fq, sch_fq_codel)"
}
So, what does this mean, and what kind of context makes this expression make sense?

First, some context: every best practice rule in the Assimilation suite is evaluated against some incoming discovery data – in this case, the output of the proc_sys discovery script. Like all discovery agents, it outputs JSON: all the various /proc/sys values, in a form similar to {“net.core.default_qdisc”: “fq_codel”}. In this context, $net.core.default_qdisc has the value “fq_codel”. The IN function evaluates its first argument and returns True if that value is found among its remaining arguments. This is equivalent to the Python expression net_core_default_qdisc in (“sch_fq”, “fq_codel”). All our best practice rules (GraphNodeExpressions) are function call expressions like this one.
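To make the IN() semantics concrete, here is a hypothetical re-implementation in shell – the function name mirrors the rule language, but everything else here is our own sketch:

```shell
#!/bin/sh
# Hypothetical shell version of IN(): does the first argument
# appear among the remaining arguments?
IN() {
    needle="$1"
    shift
    for candidate in "$@"; do
        if [ "$needle" = "$candidate" ]; then
            echo "True"
            return 0
        fi
    done
    echo "False"
}

# Mirrors IN($net.core.default_qdisc, sch_fq, fq_codel) when the
# discovered value is "fq_codel":
IN fq_codel sch_fq fq_codel     # prints "True"
IN pfifo_fast sch_fq fq_codel   # prints "False"
```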

The name itbp-00001 refers to the corresponding IT Best Practice rule. The category is networking – as this is a networking rule – which helps you figure out who should be informed about rule violations.

It’s not hard to figure out – although you do have to know what the JSON produced by the particular discovery agent looks like. The cool thing is that you can discover anything you want, and then write best practice rules against the data you’ve discovered.
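For instance, here is a hypothetical rule in the same style – the rule name is invented for illustration, and it assumes the proc_sys discovery data includes net.ipv4.tcp_congestion_control:

```
"my-congestion-rule": {
   "category": "networking",
   "rule": "IN($net.ipv4.tcp_congestion_control, cubic)"
}
```

The pattern is always the same: pick a value out of the discovery JSON with $, then test it with a GraphNodeExpression function such as IN.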

This rule gets evaluated when a system first comes up, and whenever something in /proc/sys changes. In effect, it is evaluated only when the corresponding discovery agent (proc_sys) reports new data. Since these values rarely change, we rarely have to look at this rule – yet we re-evaluate it as soon as anything changes, every time we need to. This is all a natural consequence of our RNNIGN protocol.


Bufferbloat is hard to explain but, fortunately, easy to fix on Linux hosts. Don’t stop with your hosts, though – check with your network vendors to see what they’re doing to address bufferbloat on your network, and make sure you’ve implemented their fixes. As a bonus, we got to learn all about this from the cool IT Best Practices project – which has a pretty reasonable practical guide to the problem.


5 thoughts on “Bufferbloat: A Network Best Practice You’re Probably Doing Wrong”

  1. Pretty good. A couple comments:

    1) I would not recommend “codel” as a current best practice. Codel by itself is primarily there as a test of the algorithm. sch_fq_codel is a good general purpose default, but on servers that are primarily using tcp, sch_fq is more desirable. A box doing routing (ip forwarding) should use sch_fq_codel, a vm doing primarily tcp should use sch_fq, the bare metal under it (basically doing forwarding), fq_codel.

    I know of no reliable way to determine if a box is virtualized. :(.

    There is a new qdisc out there, called cake, which may one day become a best practice in more scenarios once it’s done (more testing is needed).


    2) a router box is often best configured to do shaping of some sort (using cake, htb, or hfsc) before bringing online fq_codel.

    So I would modify your best practice detector to look for fq, fq_codel, htb, cake, hfsc. When parsing for htb or hfsc, look for codel, fq_codel, or cake as sub qdiscs.

    3) your sysctl is wrong and needs an =

    4) The presence of BQL on the hardware would also be nice to detect.

    5) Nothing at the moment works particularly well on wifi – but work is in progress.


    Aside from that thx very much for your efforts in making for replicatable best practices. We’re getting there….

  2. ah, in looking this over further, you are looking for the default qdisc, not the actual qdisc on the interfaces. So a “default” of sch_fq or sch_fq_codel would be a “pass”, IMHO.

    detecting if BQL was available on the ethernet interfaces would be nice, too, as I said.