Pair Programming Wrap-Up

I previously blogged about the nature of expertise, and resulting questions about the efficiency of expert-expert pairings. Here I’d like to tidy up some loose ends with regard to the relationship between expertise and pair programming.

How does expertise develop?

Kahneman’s book points out that expertise develops, virtually inevitably, when a practitioner has long experience in a field where both of the following apply:

There is indeed some correlation between decisions and outcomes. (Conversely, no amount of practice will ever give you expertise in predicting the next outcome on a roulette wheel!) In the field of software development, our design decisions do indeed influence the outcomes of correctness, performance and maintainability.
Practitioners receive feedback on their decisions (so that their inner neural network can learn). In other words, we make decisions, and then actually see how those decisions affect the correctness, performance and maintainability of our software. As long as we maintain an open-minded willingness to learn, and we don’t change jobs too often, we will receive this feedback and therefore expertise will develop.

A case in favour of pairing

While I’ve questioned the efficiency of expert-expert pairings, the science of expertise suggests a possible benefit of expert-novice pairings – a benefit which I have not seen described elsewhere. I suspect that such pairing may help the novice to develop expertise more quickly. Not because of what the expert says (as implied by some other descriptions of expert-novice pairings), but because of what the expert does. Remember that much of what we call “expertise” takes place unconsciously, so the expert couldn’t put it into words even if they tried. But, perhaps, there is some value in less experienced programmers seeing how the expert works. Maybe that will give the less experienced member of the pair a chance to pick up on things which the expert could never teach verbally.

A related point is that much of programming is about how the software is produced. Yes, you can learn from reading other people’s code after it’s finished (and we should do this more often) but I think we might learn more from observing how that code is produced – What kind of things did the author try as they worked? Where did they backtrack? How did unit tests guide their work? What are their special Google search tricks for finding relevant information?

(Note: don’t forget the conventional advice that “pair programming is best done by peers”. An expert-novice pairing is arguably not at all the same thing as regular pair programming. My point here is that it might be a valuable tool for transferring the expert’s “automatic” (“non-conscious”) skills. I would see it being used in moderation, rather than for 6 or 8 hours a day like regular pairing.)

The mathematics of paired experts

I discussed my previous post with a friend before writing it. We agreed that experts can’t discuss the first part of their thinking (the automatic bit), and that discussing the second part (the effortful bit) takes longer because the experts have to slow down to the speed of human speech. However, my friend suggested that the collaboration will still pay off, due to improved quality of the final decision.

That could be a very strong argument if feedback from a peer was the only kind of feedback available. But, it’s not. Experts also get feedback from the situation, via unit tests etc.

Example: An expert is looking for the cause of a defect. The bug is reproducible, but the expert doesn’t know what causes it. Hunting for the bug is like a search. At each step in the process, the expert chooses a “move” that will narrow the range of possible causes. Moves may include setting particular breakpoints, looking for patterns in the inputs that cause the bug, stepping through the code, adding additional unit tests, and so on. Each move narrows down the search space, until eventually the defect is found. The better the move, the more it narrows the search space. A really good first move might eliminate 90% of possible causes, leaving only 10% remaining. A less efficient move may eliminate only 50% of the possible causes, leaving the other 50% still to be searched.

Question: if the expert is paired with another, will the pair choose substantially better moves? Yes, they probably will. After some discussion, the pair may choose a “90% move” (eliminating 90% of possible causes) while an expert working alone might only come up with a “70% move”.

But what about the time cost of the pair’s discussion? In the time that the pair spend discussing one move, a solo expert might try several. Imagine that a solo expert can make two moves in the time it takes a pair to discuss one. If the solo expert tries one 70% move, followed by another, they’ll cut the search space down to 9% of the original space [(100% - 70%)^2 = 30%^2 = 9%]. That’s about the same amount of progress as the pair’s “90% move”, which leaves 10%. So the solo expert has produced the same result as the pair, in the same amount of time, but for only half the personnel cost. If the solo expert’s cycle time is shorter still, allowing them to try 3 or 4 things in the time that the pair tries one, the solo expert will outperform the pair in both cost and elapsed time.
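
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that every move eliminates the same fixed fraction of whatever search space remains, which real debugging will not do so neatly. The 70% and 90% figures are the made-up numbers from the example, not measurements.

```python
import math


def remaining_fraction(eliminated_per_move: float, moves: int) -> float:
    """Fraction of the search space left after `moves` moves.

    Assumes (for illustration only) that each move eliminates the same
    fraction of whatever space is still left.
    """
    if not 0.0 <= eliminated_per_move <= 1.0:
        raise ValueError("eliminated_per_move must be between 0 and 1")
    if moves < 0:
        raise ValueError("moves must not be negative")
    return (1.0 - eliminated_per_move) ** moves


def moves_to_match(solo_eliminated: float, pair_eliminated: float) -> float:
    """Number of solo moves needed to match ONE pair move.

    Solves (1 - solo)^n = (1 - pair) for n.
    """
    if not (0.0 < solo_eliminated < 1.0 and 0.0 < pair_eliminated < 1.0):
        raise ValueError("both fractions must be strictly between 0 and 1")
    return math.log(1.0 - pair_eliminated) / math.log(1.0 - solo_eliminated)


if __name__ == "__main__":
    # Pair: one "90% move". Solo: several "70% moves" in the same elapsed time.
    print(f"pair, 1 move:  {remaining_fraction(0.9, 1):.1%} of the space left")
    for n in (1, 2, 3, 4):
        print(f"solo, {n} moves: {remaining_fraction(0.7, n):.1%} of the space left")
    print(f"solo moves needed to match one pair move: {moves_to_match(0.7, 0.9):.2f}")
```

Under these assumptions the break-even point is a little under two solo moves per pair move. If the solo expert cycles faster than that, they get ahead on elapsed time as well as on personnel cost; if they cycle more slowly, the pair gets ahead on elapsed time.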

In summary, I suggest that by using a shorter cycle time, a solo expert might outperform a pair of experts. This is a special case of the general principle that “starting to iterate” often beats “improving the plan”.

So how should experts interact?

If experts avoid pair programming, will they end up working completely alone? No. I think there are at least two options worth considering:

Option 1: Expert “pair preview”

I’m suggesting this based on a real life example. I’m calling it “preview” rather than “review” because it comes before the code is written. I’ll anonymise it by changing the names to “Alice” and “Bob”.

Alice was (and still is) an awesome programmer, whose expertise I greatly respect. She was working alone on a difficult part of the system. Once Alice had an initial design in mind, she ran it by Bob for review. So far, so good. But the review is where things came unstuck. Alice and Bob didn’t know Kahneman’s concept of effortful versus automatic thought. Alice described her proposed design to Bob. Bob understood it, but had a hunch that an alternative design would be simpler. Bob didn’t know it, but his half-formed alternative design was generated by the automatic “part” of his mind. Because it was automatically-generated, it seemed like nothing but a vague hunch. Consequently neither Bob nor Alice gave it the attention it deserved.

In hindsight, they should have recognised Bob’s hunch for what it was: an instinctive product of Bob’s long expertise, and therefore something worth looking into. Perhaps Alice should have spent a couple of days spiking Bob’s idea. That would have tested the hunch, and by having Alice do the spiking (rather than Bob, who had suggested it), we would have leveraged Alice’s existing detailed knowledge of the problem space. (Having Alice do the spike would have also made it psychologically easier for her to embrace the new idea, if it did indeed prove to be the best.)

Unfortunately, no such spiking happened, and the two programmers did not manage to have an effective discussion of the problem. Alice spent about 5 weeks implementing her proposed design. Some years later, it proved inadequate to the growing needs of the system, at which point it was re-implemented along the lines of Bob’s hunch – in only 2.5 days.

Option 2: “Expert Escalates”

Even experts get stuck. Part of being a responsible expert is to realise when you’re not making productive progress, and seek out someone to bounce ideas off. Perhaps they will see something that you haven’t, or spark some new ideas.

What about the evidence?

Two commenters on my previous post suggested that my concerns were all very well, but couldn’t be valid because the research shows that pair programming works. I promised to respond with some comments on “the evidence”.

It turns out that the evidence is rather more mixed than you might suppose. Here’s a quote from the abstract of a meta-analysis done by the excellent Simula Research Laboratory.

…between-study variance is significant, and there are signs of publication bias among published studies on pair programming. A more detailed examination of the evidence suggests that pair programming is faster than solo programming when programming task complexity is low and yields code solutions of higher quality when task complexity is high. The higher quality for complex tasks comes at a price of considerably greater effort
— from here, or alternate link [emphasis added]

The finding, that paired effort for complex tasks is much greater than solo effort, appears consistent with my above concerns and reasoning about paired experts.

5 comments on “Pair Programming Wrap-Up”

John Rusk says:

March 10, 2012 at 8:15 pm

BTW there is an interesting counter-argument here https://www.higherorderlogic.com/2008/06/test-driven-development-a-cognitive-justification/ , in which it is suggested that we _should_ aim to slow experts down. I’m not sure I buy it, but I do find it interesting.
Arlo Belshee says:

July 14, 2012 at 5:48 pm

Experts need not take longer cycles in pairs than they do solo. In fact, if you stand back and watch with a stopwatch, you might find a surprising result.

When I’ve done this, I found that pairs cycled more quickly. They actually tried more ideas per unit time. My observation was that pairs went the same speed when they weren’t blocked and they were blocked less frequently and for shorter periods.

And when questioned, the pairs nearly always reported that they were cycling more slowly. People didn’t feel the blocks when they were solo (because they were thinking?). They did feel the drag of conversation.

My teammates were really surprised by the results. So we re-ran with each of them being observer for a half-day. We still had trouble believing the stopwatch, but the evidence was clear.
John Rusk says:

July 15, 2012 at 1:22 am

Interesting. If I understand correctly, the reason why cycle time is not harmed by pairing is as per your comment here https://www.agilekiwi.com/other/agile/expertise-meets-pair-programming/

By the way, if a (non-blocked) pair cycles at the same rate as a (non-blocked) solo, then superficially that would suggest that pairing takes twice as many person-hours to accomplish the same work. Of course, as you say, pairs remain non-blocked for a greater percentage of time than solos. From your measurements, are you able to guesstimate the overall cost-effectiveness of pairing versus soloing? (Bearing in mind not just cycle times, but also different defect rates and therefore different costs of defect resolution. And also faster learning, as per your recent comment on your own blog – https://arlobelshee.com/post/is-pair-programming-for-me).
Arlo Belshee says:

November 21, 2014 at 10:09 am

I don’t have productivity data for the individual level, but that doesn’t matter: I have productivity data for the team as a whole.

I compare teams that work as individuals with those who execute 100% of the work in pairs (with rotations), and exclude the first 5 weeks of pairing (allowing time to build skill and stabilize team norms). I do this by comparing a team with itself, before and after it pairs.

Pretty much every time (100% in my data, but exceptions probably exist), I see the pairing team complete around 200-400% of the work it did as solos. Same product, same team members, same kind of work. This is all-in productivity.

I measure defect-free value delivered to the end customer. So bug fixes don’t count as productivity (it is failure demand). You get the same credit for delivering something right the first time as delivering it wrong and then fixing it. I measure defect-free value delivered per team-week. That number typically goes up by 2x-4x. It goes up more the more screwed up the team was before starting.

By point of reference, I find the independent productivity gains from TDD (tiny cycle, done right) to be about 1.5-2x and from Refactoring (mechanized, done right) to be about 2x-3x. Taken all together, they tend to get to right around 10x (there is some overlap, presumably errors that would be prevented by any of the 3 practices or learning that would be incorporated similarly).

At this point, I consider an XP transition to have failed if it doesn’t get at least a 5x improvement in real productivity by the end of month 2.
John Rusk says:

November 22, 2014 at 12:25 pm

Hi Arlo,

Thanks for your reply. At what stage of the project do you typically work with teams? Near the start, or at least during the first third of the project?

Second question: what about a team that isn’t “screwed up” very much at all? For instance, I was recently involved in a project that most of us feel was particularly successful. 20+ people, 18 months, delivered on time and “on scope” with almost no overtime worked. Most of us really enjoyed the project, and for a number of reasons we felt it was definitely the best we’d seen in our careers. But we weren’t pairing. Which raises the question: would we have delivered in half the time if we’d been pairing? That’s what your data would suggest. And yet, the team was performing so well, I find it hard to imagine that any change could have doubled its performance.

One final question: are the teams you work with composed almost entirely of developers, or do they have, say, a roughly even split between business analysts, developers and testers? I can imagine that the role, and value, of pairing may differ between these two kinds of team composition.

AgileKiwi

The Neglected Essentials of Software Development