Guarantees are limits, in the limit

One of the things I like to do is to challenge myself with interesting problems in computer science. It’s fun to see if I can figure out how to solve them, now that someone else already has.


One particular thing I’d love to do is provide minimum resource guarantees to the customers of a program, not just to its processes.
Solaris had per-process “CPU shares”, and Linux has “CPU control groups”. I love them. They provide minimum guarantees rather than hard upper limits. If I give a 50% CPU guarantee to program A and the system isn’t totally busy, then A is welcome to take more than 50%. Only when the system is 100% busy will the program be limited to 50%.
They’re lovely, but they’re also complicated. The authors spent a lot of time on possible edge cases and lovely, general algorithms.
I’m not that smart. In fact, I’m lazy. I like what mathematicians do: “strength reducing” a hard problem into a solved problem, plus a little overhead to do the transformation.
Skipping ahead, it turns out one of the strength reductions applies here. If you only enforce guarantees when the system has already started to get overloaded, there’s a really small difference between a limit and a guarantee. And limits are a solved problem.


So let’s set the stage, and see if this will work.
Imagine I have a program that accepts a REST request. It either returns 200 OK or a 503 Service Unavailable with a Retry-After header when it’s temporarily too busy.
Also imagine I have two interesting customers, A and B. I also have another bunch, C through Z.
  • A is big but not very profitable, and I give them a guarantee of 50 connections.
  • B is larger and pays better, so I give them 100 connections, in the hopes they’ll make more money and pay me more commission.
  • C through Z get the rest of the connections, out of 500.

Doing it wrong

Imagine what would happen if I enforced limits: if I limit A to 50 connections while there is CPU free, I’ll lose money. That’s bad.
Ditto if I limit B and there’s CPU free.
Conversely, if I don’t limit A and there isn’t CPU free, A will overload me, and I’ll slow down. Pretty soon I start timing out and losing money.
I’ll also starve B of connections, again losing me money.
So I don’t want to use limits at low load, but I do want to limit A and B at high loads. At 100% load, I want to limit them to exactly their guarantees, 50 and 100 connections.
Stating the problem that way made the answer below jump out at me.


Let’s treat guarantees as limits only when we’re about 90% busy. In our case that’s a load of roughly 450 out of 500 connections.
The code to do that might look like this:
var busy int              // busyness, in connections
var inUse map[int]int     // connections in use, per customer
var guarantee map[int]int // guaranteed connections: 50 for A, 100 for B

// removeCompletedRequests is assumed to decrement busy (and inUse)
// as connections finish.
for busy = 0; ; busy = removeCompletedRequests(busy) {
	cust, stuff := getNextCustomerRequest()
	if busy > 450 {
		// apply guarantees as if they were limits
		current := inUse[cust]
		if current < guarantee[cust] {
			inUse[cust] = current + 1
			dispatch(cust, stuff)
		} else {
			// don't let this customer run
			reject(cust, stuff)
		}
	} else {
		// no problem, let it run
		inUse[cust]++
		dispatch(cust, stuff)
	}
}
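The sketch above leaves out declarations and bookkeeping. A compilable version of just the admission decision might look like this; the function name `admit` and the string keys are mine, not from a real API:

```go
package main

import "fmt"

const threshold = 450 // start enforcing guarantees at ~90% of 500 connections

// admit decides whether a customer's next request gets a connection.
// busy is the total connections in use; inUse and guarantee are per-customer.
func admit(busy int, inUse, guarantee map[string]int, cust string) bool {
	if busy <= threshold {
		return true // plenty of headroom: everyone runs
	}
	// above the threshold, the guarantee acts as a limit
	return inUse[cust] < guarantee[cust]
}

func main() {
	guarantee := map[string]int{"A": 50, "B": 100}
	inUse := map[string]int{"A": 60, "B": 80}
	fmt.Println(admit(400, inUse, guarantee, "A")) // true: below threshold, no limits
	fmt.Println(admit(460, inUse, guarantee, "A")) // false: overloaded and A is over 50
	fmt.Println(admit(460, inUse, guarantee, "B")) // true: B is still under 100
}
```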
So what happens to our two customers, A and B? At low load they get everything they ask for, and our definition of low is everything smaller than 450. So everyone gets the resources they want, and we make money.
Above 450 connections, everything changes…

First, A hits 50

If A is already above 50, let’s say at 60 connections, then A stops getting new connections. All the existing connections are fine; they’re not affected, and run to completion.
A starts getting rejections. Eventually 11 of A’s connections complete, and A is now using only 49. The next request gets them a connection, and A then continues at its maximum of 50 connections.
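That drain-down can be simulated in a few lines. This is a toy model under a simplifying assumption of mine: A keeps offering new requests while its existing connections complete one at a time.

```go
package main

import "fmt"

const guaranteeA = 50

func main() {
	inUseA := 60 // A is over its guarantee when the threshold kicks in
	rejected := 0
	// each new request from A is rejected while A is at or above 50;
	// meanwhile A's existing connections run to completion
	for inUseA >= guaranteeA {
		rejected++ // A's new request is rejected
		inUseA--   // one of A's connections completes
	}
	fmt.Println(inUseA)   // 49: eleven connections have completed
	fmt.Println(rejected) // 11 rejections along the way, under this toy model
}
```

The next request after that is admitted, and A settles at its 50-connection guarantee.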
B is still getting everything they asked for, if they haven’t hit 100 yet…

Now B hits 100

OK, they now start getting rejections too.
If A has 50 and B has 100, everyone else must have 450 – 150 = 300 connection requests among them, and there are now only 50 connections worth of headroom. The system is pretty darned active, and is making money.
The fact that I limited A and B at what is a fair share of the resources isn’t costing me money. If I didn’t limit them, I’d start slowing down, timing out and … losing money.
And I react quickly: each time around the loop, before looking at a new customer request, I check how many connections have completed and are available again. As soon as a customer is below their guarantee, they get more connections. As soon as we drop below 450, everybody gets connections.


Since this is me, that’s not a particularly good program. It’s single-threaded, it uses expensive structures like hash maps and the like, but it is O(1). It needs to be rewritten as idiomatic Go, using a select on a stream of offered work and another stream of connections that have finished.
But the question I’m asking is “can you get there from here?”, not “how good is the code?”. And it looks like we can. Guaranteed minimums, in the limit, are just limits, and therefore a mere matter of programming.
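For what it’s worth, the idiomatic restructuring mentioned above — a select over a stream of offered work and a stream of finished connections — might be sketched like this. All the names and the shutdown channel are my inventions, and the bookkeeping is deliberately minimal:

```go
package main

import "fmt"

type request struct{ cust string }

// serve selects over offered work and completed connections, applying the
// guarantee-as-limit rule only above the threshold. It returns the number
// of requests dispatched when told to quit.
func serve(work <-chan request, done <-chan string, quit <-chan struct{}) int {
	const threshold = 450
	guarantee := map[string]int{"A": 50, "B": 100}
	inUse := map[string]int{}
	busy := 0
	dispatched := 0
	for {
		select {
		case r := <-work:
			if busy > threshold && inUse[r.cust] >= guarantee[r.cust] {
				continue // reject: over guarantee while overloaded
			}
			inUse[r.cust]++
			busy++
			dispatched++
		case cust := <-done:
			inUse[cust]--
			busy--
		case <-quit:
			return dispatched
		}
	}
}

func main() {
	work := make(chan request)
	done := make(chan string)
	quit := make(chan struct{})
	go func() {
		for i := 0; i < 3; i++ {
			work <- request{cust: "A"} // offer three requests at low load
		}
		quit <- struct{}{}
	}()
	fmt.Println(serve(work, done, quit)) // 3
	_ = done
}
```

Because the channels are unbuffered, each send synchronizes with a receive, so the three requests are dispatched before the quit arrives.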

