Evil Spock has to
- Stop Bones’ brain,
- Read out what he wants to know, and
- Unlock Bones’ brain,
so he can breathe.
Of course, this is Evil Spock, so he may not care if Bones breathes or not.
Rob Pike has a better way
Rob wrote, “Don’t communicate by sharing memory; share memory by communicating.”
Let’s consider a real-world example: I have 200-odd stores generating sales data, sending me a record for every transaction from every cash register. I want to extract per-minute totals for each of them, so I can look at a graph and see if they’re having an outage, a good sales day or a problem. I don’t want to wait for my big-data pipeline to run hourly, so I need to drink from that firehose.
In Java or Concurrent Pascal, I’d have a table, and I’d need to
- find a row that corresponds to the right store
- lock it for update
- add to its contents
- unlock it
in real time, as fast as I can read records from my Kafka cluster.
Once a minute, I need to read through the whole table
- locking each row, so it can’t change
- taking a total and average
- reporting the count, total and average
- and unlocking the row.
Kafka will have to wait for the calculation and the writing out, and then catch up.
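Here is a minimal sketch of that locked-table approach. It is written in Go rather than Java or Concurrent Pascal so it can be compared with the Go measurements later on. The names (row, table, add, report) are made up for illustration, a few goroutines stand in for the Kafka readers, and resetting the row after each report is my assumption about what "per-minute" means.

```go
package main

import (
	"fmt"
	"sync"
)

// row is one store's running totals, guarded by its own lock.
type row struct {
	mu    sync.Mutex
	count int
	total float64
}

// table maps a store number to its row. The map is filled once at startup
// and never resized, so only the rows themselves need locking.
type table map[int]*row

// add is the per-record path: find the row, lock it, add to it, unlock it.
func (t table) add(store int, amount float64) {
	r := t[store]
	r.mu.Lock()
	r.count++
	r.total += amount
	r.mu.Unlock()
}

// report is the once-a-minute path for one row. Every add for this store
// has to wait while the lock is held.
func (t table) report(store int) (count int, total, avg float64) {
	r := t[store]
	r.mu.Lock()
	defer r.mu.Unlock()
	count, total = r.count, r.total
	if count > 0 {
		avg = total / float64(count)
	}
	r.count, r.total = 0, 0 // assumption: start the next minute from zero
	return count, total, avg
}

func main() {
	t := table{1: &row{}, 2: &row{}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // four goroutines stand in for the Kafka readers
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := 0; n < 1000; n++ {
				t.add(1+n%2, 2.50)
			}
		}()
	}
	wg.Wait()
	for store := range t {
		count, total, avg := t.report(store)
		fmt.Printf("store %d: count=%d total=%.2f avg=%.2f\n", store, count, total, avg)
	}
}
```

Every call to add and every call to report goes through the same per-row lock, which is exactly where the readers and the reporter get in each other's way.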
Can you say “contention”? I’m telling the table to stop breathing while I mind-meld 200-odd results out.
And locks are slow: they have to go all the way to main memory, and back, and can’t be cached. And when they contend, everyone has to wait for the owner of the lock before they can go back to doing work.
Set up a worker for each store. Give it a little one-row table, and a pipe to read inputs from.
Have a distributor read Kafka and write each record to the pipe for that particular store.
The worker sits in a loop, reading a record and updating the table. Once a minute, instead of reading, it computes the count, average and total and writes them out. Then it goes back to reading the pipe. It doesn’t need any locks whatsoever, and only has to hold its breath for the time it takes to do one division, a printf and set three variables to zero.
I measured it: with Go 1.71 and a 4-core laptop, if the time inside the locked region exceeds 100 nanoseconds, the locks contend, and my test program goes from 1,367 ns/operation to 28,937 ns/operation. Twenty-one times slower.
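A rough sketch of the worker-and-distributor arrangement in Go, with channels playing the part of the pipes. The sale and summary types, the function names, and the slice of records standing in for Kafka are all hypothetical. I also have the worker send its summary on a channel instead of calling printf, and report whatever is left when its pipe is closed, so the example ends cleanly.

```go
package main

import (
	"fmt"
	"time"
)

// sale is a hypothetical transaction record.
type sale struct {
	store  int
	amount float64
}

// summary is what a worker reports at the end of each interval.
type summary struct {
	store, count int
	total, avg   float64
}

// worker owns its one-row table (count and total) outright, so no lock is needed.
func worker(store int, in <-chan sale, tick <-chan time.Time, out chan<- summary) {
	var count int
	var total float64
	flush := func() {
		s := summary{store: store, count: count, total: total}
		if count > 0 {
			s.avg = total / float64(count)
		}
		out <- s
		count, total = 0, 0
	}
	for {
		select {
		case rec, ok := <-in:
			if !ok {
				flush() // pipe closed: report what is left and stop
				return
			}
			count++
			total += rec.amount
		case <-tick: // once a minute, report instead of reading
			flush()
		}
	}
}

// distribute writes each record to the pipe for its store, then closes the pipes.
func distribute(records []sale, pipes map[int]chan sale) {
	for _, rec := range records {
		pipes[rec.store] <- rec
	}
	for _, p := range pipes {
		close(p)
	}
}

func main() {
	stores := []int{1, 2}
	pipes := make(map[int]chan sale)
	out := make(chan summary, len(stores))
	for _, id := range stores {
		pipes[id] = make(chan sale, 64)
		go worker(id, pipes[id], time.Tick(time.Minute), out)
	}
	distribute([]sale{{1, 10}, {1, 20}, {2, 5}}, pipes)
	for range stores {
		s := <-out
		fmt.Printf("store %d: count=%d total=%.2f avg=%.2f\n", s.store, s.count, s.total, s.avg)
	}
}
```

The only state two goroutines ever touch together is the channel; count and total belong to exactly one worker, which is why there is nothing to lock.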
Do you see why Bones looks so unhappy?