Pair Programming Wrap-Up

I previously blogged about the nature of expertise, and resulting questions about the efficiency of expert-expert pairings.  Here I’d like to tidy up some loose ends with regard to the relationship between expertise and pair programming.

How does expertise develop?

Kahneman’s book points out that expertise develops, virtually inevitably, when a practitioner has long experience in a field where both of the following apply:

  1. There is indeed some correlation between decisions and outcomes. (Conversely, no amount of practice will ever give you expertise in predicting the next outcome on a roulette wheel!) In the field of software development, our design decisions do indeed influence the outcomes of correctness, performance and maintainability.
  2. Practitioners receive feedback on their decisions (so that their inner neural network can learn). In other words, we make decisions, and then actually see how those decisions affect the correctness, performance and maintainability of our software. As long as we maintain an open-minded willingness to learn, and we don’t change jobs too often, we will receive this feedback and therefore expertise will develop.

A case in favour of pairing

While I’ve questioned the efficiency of expert-expert pairings, the science of expertise suggests a possible benefit of expert-novice pairings – a benefit which I have not seen described elsewhere. I suspect that such pairing may help the novice to develop expertise more quickly. Not because of what the expert says (as implied by some other descriptions of expert-novice pairings), but because of what the expert does. Remember that much of what we call “expertise” takes place unconsciously. So the expert couldn’t put it into words even if they tried. But, perhaps, there is some value in less experienced programmers seeing how the expert works. Maybe that will give the less-experienced member of the pair a chance to pick up on things which the expert could never teach verbally.

A related point is that much of programming is about how the software is produced. Yes, you can learn from reading other people’s code after it’s finished (and we should do this more often) but I think we might learn more from observing how that code is produced – What kind of things did the author try as they worked? Where did they backtrack? How did unit tests guide their work? What are their special Google search tricks for finding relevant information?

(Note: don’t forget the conventional advice that “pair programming is best done by peers”. An expert-novice pairing is arguably not at all the same thing as regular pair programming.  My point here is that it might be a valuable tool for transferring the expert’s “automatic” (“non-conscious”) skills.  I would see it being used in moderation, rather than for 6 or 8 hours a day like regular pairing.)

The mathematics of paired experts

I discussed my previous post with a friend before writing it. We agreed that experts can’t discuss the first part of their thinking (the automatic bit), and that discussing the second part (the effortful bit) takes longer because the experts have to slow down to the speed of human speech.  However my friend suggested that the collaboration will still pay off, due to improved quality of the final decision.

That could be a very strong argument if feedback from a peer was the only kind of feedback available.  But, it’s not.  Experts also get feedback from the situation, via unit tests etc.

Example: An expert is looking for the cause of a defect.  The bug is reproducible, but the expert doesn’t know what causes it.  Hunting for the bug is like a search.  At each step in the process, the expert chooses a “move” that will narrow the range of possible causes.  Moves may include setting particular breakpoints, looking for patterns in the inputs that cause the bug, stepping through the code, adding additional unit tests, and so on.  Each move narrows down the search space, until eventually the defect is found.  The better the move, the more it narrows the search space.  A really good first move might eliminate 90% of possible causes, leaving only 10% remaining.  A less efficient move may eliminate only 50% of the possible causes, leaving the other 50% still to be searched.

Question: if the expert is paired with another, will the pair choose substantially better moves?  Yes, they probably will.  After some discussion, the pair may choose a “90% move” (eliminating 90% of possible causes) while an expert working alone might only come up with a “70% move”.

But what about the time cost of the pair’s discussion?  In the time that the pair spend discussing one move, a solo expert might try several. Imagine that a solo expert can make two moves in the time it takes a pair to discuss one.  If the solo expert tries one 70% move, followed by another, they’ll cut the search space down to 9% of the original space. [ (100% – 70%)^2 =  30%^2 = 9%].   That’s about the same amount of progress as the pair’s “90% move”.   So the solo expert has produced the same result as the pair, in the same amount of time, but for only half the personnel cost.  If the solo expert’s cycle time is even shorter still, allowing them to try 3 or 4 things in the time that the pair tries one, the solo expert will outperform the pair in both cost and elapsed time.
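The arithmetic above can be sketched in a few lines of code. This is a toy model of my own, not a real simulation – the move strengths and cycle counts are the illustrative figures from the paragraph above:

```python
# Toy model: each debugging "move" eliminates some fraction of the remaining
# search space. A solo expert's shorter cycle time lets them fit in more
# moves than a pair spends discussing one.

def remaining_space(move_strengths):
    """Fraction of the search space left after a sequence of moves,
    where each move eliminates the given fraction of what remains."""
    remaining = 1.0
    for strength in move_strengths:
        remaining *= (1.0 - strength)
    return remaining

# The pair: one well-discussed "90% move" per discussion cycle.
pair = remaining_space([0.9])

# The solo expert: two quicker "70% moves" in the same elapsed time.
solo = remaining_space([0.7, 0.7])

print(f"pair: {pair:.2f} of search space left")   # 0.10
print(f"solo: {solo:.2f} of search space left")   # 0.09
```

Same elapsed time, roughly the same narrowing of the search space – but the solo expert achieved it at half the personnel cost.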

In summary I suggest that, by using a shorter cycle time, a solo expert might outperform a pair of experts.  This is a special case of the general principle that “starting to iterate” often beats “improving the plan”.

So how should experts interact?

If experts avoid pair programming, will they end up working completely alone?  No.  I think there are at least two options worth considering:

Option 1: Expert “pair preview”

I’m suggesting this based on a real life example.  I’m calling it “preview” rather than “review” because it comes before the code is written.  I’ll anonymise it by changing the names to “Alice” and “Bob”.

Alice was (and still is) an awesome programmer, whose expertise I greatly respect.  She was working alone on a difficult part of the system.  Once Alice had an initial design in mind, she ran it by Bob for review.  So far, so good.  But the review is where things came unstuck.  Alice and Bob didn’t know Kahneman’s concept of effortful versus automatic thought.  Alice described her proposed design to Bob.  Bob understood it, but had a hunch that an alternative design would be simpler.  Bob didn’t know it, but his half-formed alternative design was generated by the automatic “part” of his mind.  Because it was automatically-generated, it seemed like nothing but a vague hunch.  Consequently neither Bob nor Alice gave it the attention it deserved.

In hindsight, they should have recognised Bob’s hunch for what it was: an instinctive product of Bob’s long expertise, and therefore something worth looking into.  Perhaps Alice should have spent a couple of days spiking Bob’s idea.  That would have tested the hunch, and by having Alice do the spiking (rather than Bob, who had suggested it), we would have leveraged Alice’s existing detailed knowledge of the problem space.  (Having Alice do the spike would have also made it psychologically easier for her to embrace the new idea, if it did indeed prove to be the best.)

Unfortunately, no such spiking happened, and the two programmers did not manage to have an effective discussion of the problem.  Alice spent about 5 weeks implementing her proposed design.  Some years later, it proved inadequate to the growing needs of the system, at which point it was re-implemented along the lines of Bob’s hunch – in only 2.5 days.

Option 2: “Expert Escalates”

Even experts get stuck.  Part of being a responsible expert is to realise when you’re not making productive progress, and seek out someone to bounce ideas off.  Perhaps they will see something that you haven’t, or spark some new ideas.

What about the evidence?

Two commenters on my previous post suggested that my concerns were all very well, but couldn’t be valid because the research shows that pair programming works.  I promised to respond with some comments on “the evidence”.

It turns out that the evidence is rather more mixed than you might suppose. Here’s a quote from the abstract of a meta-analysis done by the excellent Simula Research Laboratory.

…between-study variance is significant, and there are signs of publication bias among published studies on pair programming.  A more detailed examination of the evidence suggests that pair programming is faster than solo programming when programming task complexity is low and yields code solutions of higher quality when task complexity is high. The higher quality for complex tasks comes at a price of considerably greater effort
— from here, or alternate link [emphasis added]

The finding that paired effort for complex tasks is much greater than solo effort appears consistent with my concerns and reasoning above about paired experts.

Expertise versus Pair Programming

Let’s start this post with a thought experiment.  Not in software development, but in playing chess.

Imagine two novice chess players, working as a team. (We’ll assume their opponent is a computer, so it can’t overhear them talk.)  Our two novices will benefit greatly from their collaboration.  They’ll discuss all their thinking – everything from possible moves, to correcting each other’s mistakes, to “can you remind me how a knight moves?”.  Working together like this they will make fewer mistakes, and generate better moves than each would have done alone.

But what if we paired two chess masters?  Scientific research shows that the thinking of experts is different. Much of it happens as automatic pattern recognition rather than conscious reasoning.  When a chess master looks at a chess position, (good) possible moves spring to mind immediately.  As described in my previous post, these “candidate moves” are generated directly by the brain’s underlying neural network.  Neural networks can’t explain their own reasoning, so the master doesn’t consciously know why those moves came to mind.  Of course, the moves are indeed based on the master’s long experience, but the mapping from prior experience to current ideas is not open to inspection by the conscious mind.


Becoming an Expert

Imagine looking at a dog.  You instantly know that it is, indeed, a dog.  That’s an incredible feat of pattern recognition, performed almost instantly and without any conscious effort.

Is it really incredible?  Yes.  It just seems easy because you’ve been doing it effortlessly since about age three.  To remind yourself how difficult it actually is, imagine designing an algorithm to recognise dogs.  Exactly what would be the rules for distinguishing a small dog from a large cat?  How would you define the category “dog” such that it included Chihuahuas and Great Danes, but not foxes and wolves?

Nobel prize winner Daniel Kahneman uses the dog example to explain the two ways humans think.  One way is effortful thought – what we do when consciously thinking about something.  The other doesn’t even feel like thinking at all.  It’s just effortless automatic perception – like seeing a dog.  Much of our brain’s activity is of the automatic kind.  Kahneman gives several examples to show the difference:

| Automatic | Effortful |
| --- | --- |
| Detect that one object is more distant than another | Focus on the voice of a particular person in a crowded and noisy room |
| Detect hostility in a voice | Tell someone your phone number |
| Answer 2 + 2 = ? | Answer 17 x 24 = ? |
| Drive a car on an empty road (unless you are just learning to drive, in which case this belongs in the “effortful” column) | Check the validity of a logical argument |


What does this have to do with expertise?  The answer is simply this: expertise involves significant “automatic” thought.  For example, you happen to be an expert in recognising dogs.  You accomplish that task with no conscious effort whatsoever.

The same applies to activities that we normally associate with “expertise”.  Chess masters are a compelling example.  When a chess master looks at a board, several strong moves immediately spring to mind.  Just like you effortlessly “see” that an animal is a dog, a chess master effortlessly “sees” which moves make sense.  This set of “candidate moves” is generated automatically, without conscious effort.  Only after the moves spring to mind does the master actually start consciously thinking about them – to decide which of the candidate moves is best.

As Kahneman writes:

[Experts’] intuitive judgements come to mind with the same immediacy as [a child’s] “doggie!”

An Analogy

It may seem hard to believe that many of our most difficult mental tasks are performed instantly and unconsciously.  As Kahneman points out, their very automatic-ness leads us to overlook their importance.

When I was reading his book, I found it helpful to recall the university paper I took in artificial intelligence.  There we learned about artificial neural networks. Neural networks implement an approach to computation which is inspired by the structure of the human brain. They are very fast and effective “pattern recognisers” but, having recognised an input as matching a particular pattern, they are completely unable to tell you why it matched the pattern. I.e. they can’t explain their “reasoning”. This exactly matches Kahneman’s description of our automatic thought – it gives you an impression/hunch very quickly, but is unable to tell you why. So:

  • I imagined the brain’s underlying hardware as a neural network – quick to recognise patterns, but unable to explain its logic.  It is here that our automatic thinking takes place.  So it’s easy to see why our automatic thinking is so fast – neural networks are naturally fast pattern recognisers, and ours is implemented directly in “hardware”.
  • I imagined effortful thought as a simulation running on top of the underlying hardware.  Rather like a virtual machine.  Inside the virtual machine, thinking is sequential and logical.  But, because the virtual machine is just an emulation, it runs slowly and has limited working memory.  (Question: does this imply our effortful thoughts aren’t “real”?  No. It just means they have a complicated origin – and that, unlike our automatic thoughts, they are subject to these performance limitations.)


I wouldn’t dare to suggest that this model is accurate in terms of the underlying biology, but as a software engineer I found it helpful in understanding Kahneman’s work.
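To make the analogy concrete, here is a toy sketch of my own (not from Kahneman’s book): a minimal perceptron – about the simplest artificial neural network there is – trained on the logical AND function. It learns to “recognise the pattern” and answers instantly, but its trained weights are just numbers; nothing in them explains *why* an input matched.

```python
# A minimal "pattern recogniser": a single perceptron trained on AND.
# Like automatic thought, it delivers answers immediately, yet its
# internals (a few weights) are not an explanation of its "reasoning".

def train_perceptron(samples, epochs=50, lr=0.1):
    """Classic perceptron learning rule, starting from zero weights."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(samples)

# The network now "sees" the pattern instantly...
print([predict(w, b, x1, x2) for (x1, x2), _ in samples])  # [0, 0, 0, 1]

# ...but asking it "why?" gets you only opaque numbers:
print(w, b)
```

Inspecting `w` and `b` tells you nothing a human would accept as a rationale – which is exactly the situation of the expert whose good ideas “just come to mind”.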

Developing Expertise

Consider two chess players, a novice and a master.  The novice relies almost entirely on effortful thought.  “How does a knight move?”, “Let me think what would happen if I moved my rook to here…”.  But the master makes heavy use of automatic thought.  The master’s automatic mind takes care of most of the details that the novice struggles with, leaving the master’s effortful mind free to add value where it is most needed.  Consequently the master can produce better decisions in less time.

But how does the novice become a master?  How does someone using effortful thought slowly transition to creating (good-quality) automatic thoughts?  Through practice.  With enough time, and enough examples, your underlying neural network trains itself to recognise the patterns.

Three sources have influenced my thinking about this kind of practice:

  • James Bach’s Myths of Rigour presentation (which to me, could equally well have been entitled “What is learning?”).  [Video here].
  • Anders Ericsson’s work on deliberate practice
  • Alistair Cockburn’s concept of Shu-Ha-Ri.  To join the dots between Shu-Ha-Ri and Kahneman’s work, I’d suggest that the Shu-Ha-Ri progression equates to the same progression described above: from a novice who uses only effortful thought, to an expert who uses a highly efficient blend of automatic and effortful thought.

A Software Example

I recently noticed how various people debug the large software solution that I’ve been working on for some years.  When I need to debug it myself, I don’t get “stuck”. I might not know the cause of the problem, but I always have an idea of what to do next – something that will take me one step closer to finding the problem. Just as experienced players instantly “see” good moves in a chess game, I tend to “see” good moves in the debugging process. A series of such moves, possibly with some backtracking, eventually leads to a solution.  But I’ve noticed junior developers are much more likely to get stuck – when faced with a problem they sometimes have trouble generating the “next move”.  I’ve also noticed that although I generate moves more successfully than juniors, I seem to expend less conscious effort in doing so.

Of course, this is no excuse for me to take an ego trip – after about 7000 hours with this code base, and a decade’s experience on other systems before it, I darn well should have automatic expertise by now!  Junior programmers, or senior ones with less experience of the system at hand, have no choice but to generate fewer moves automatically and fill the gaps with effortful thought.  Over time, their balance will gradually shift from effortful to automatic.


The science of expertise has many implications for how we recruit, train and deploy software engineers.  In a future post, I intend to explore the implications for pair programming.