January 28, 2012 | John Rusk | 13 Comments

Let's start this post with a thought experiment. Not in software development, but in playing chess.

Imagine two novice chess players, working as a team. (We'll assume their opponent is a computer, so it can't overhear them talk.) Our two novices will benefit greatly from their collaboration. They'll discuss all their thinking – everything from possible moves, to correcting each other's mistakes, to "can you remind me how a knight moves?". Working together like this they will make fewer mistakes, and generate better moves, than each would have done alone.

But what if we paired two chess masters? Research shows that the thinking of experts is different. Much of it happens as automatic pattern recognition rather than conscious reasoning. When a chess master looks at a chess position, (good) possible moves spring to mind immediately. As described in my previous post, these "candidate moves" are generated directly by the brain's underlying neural network. Neural networks can't explain their own reasoning, so the master doesn't consciously know why those moves came to mind. Of course, the moves are indeed based on the master's long experience, but the mapping from prior experience to current ideas is not open to inspection by the conscious mind.

So how will our two chess masters collaborate? Neither can explain to the other why they thought of particular moves – because the conscious, effortful part of the mind doesn't actually know. So what will their conversation consist of? They can't discuss the justification for their proposed moves, because, as we just saw, they don't know the justification. (Several writers, including Nobel laureate Daniel Kahneman, point out that if you ask an expert, "Why did you suggest that?", their effortful mind will indeed come up with a justification of the idea, but the justification is produced after the fact.
It doesn't necessarily match the invisible reasoning that originally created the idea. In fact, I'd suggest that since the invisible reasoning consisted of pattern matching in a neural network, any attempt at a logical, sequential description would be incomplete at best.)

Instead, imagine that they pool their sets of candidate moves, and then discuss which of those moves to actually play. This approach makes sense because, as we saw in the previous post, selecting one of the candidate moves requires effortful thought. And effortful thoughts can be perceived and described by the thinker. So let's assume our two chess masters try to share these thoughts.

If they do, I believe they'll run into another problem: verbalizing the thought process slows it down – a lot. A master thinks through many possibilities: What would happen if I play this move? How would my opponent probably respond? How would I respond to that? A master can process these "what ifs" relatively quickly. Not nearly as quickly as the automatic thinking that produced the initial set of candidate moves, but much, much faster than the rate of human speech. To share thoughts verbally, the master must slow down to speaking speed. In summary, although it's possible to verbalize these thoughts, there's a high (performance) cost in doing so.

Pair Programming

Now let's consider the analogy to pair programming. Novice programmers don't yet make significant use of automatic pattern recognition; instead, they rely more heavily on effortful thought. Therefore their thoughts are open to introspection and can be verbally shared. Experts, on the other hand, make significant use of the "automatic" part of their mind. So most of their thought processes are not open to conscious introspection, and the remainder are so fast that verbalization carries a huge performance cost. I think this may explain the confusion and frustration that some experienced programmers, including myself, feel when we're asked to pair.
We can't see how to map our automatic thought processes into the driver/navigator model of pair programming. Furthermore, we probably can't explain what the problem is – at least, I know I couldn't, until I read Kahneman's effortful/automatic description of the brain. His book points out that because automatic thought is, well, automatic, we lack awareness of its role in our thinking. Since we lack awareness, we struggle to explain the difficulty we have with pairing. I hope that greater awareness of Kahneman's work will give us a suitable vocabulary to describe the problem.

I also hope to stimulate discussion. The thought processes of an expert, which elegantly combine the automatic with the logical, are extremely efficient. I believe pairing undermines this efficiency – by leading coders to create after-the-fact justifications of their automatic intuitions, and by forcing them to unpack their conscious thoughts into spoken language. I suspect paired experts may even find themselves forced back into novice-like patterns of thought. What do you think: does pairing prevent experts from performing at their best?

Update, 31 Jan: I should clarify what I mean by "expert". I use the term in the sense used by Daniel Kahneman, Anders Ericsson and other researchers into expertise. But you might have seen some pair programming studies use the term "expert" in a different way, particularly those studies where all test subjects were students. In those studies the word "expert" simply means "good performer" – a student who gets unusually good grades. These cannot be experts in the Kahneman/Ericsson sense of the term. Developing that kind of expertise takes many years, far longer than any university degree.
Furthermore, Kahneman/Ericsson expertise is not simply about possessing knowledge, in the manner of an A student; instead it is about a mode of thought in which much of the knowledge is possessed and processed unconsciously, through automatic pattern recognition honed by long experience.

Update, 15 July: Arlo Belshee replied with a very informative comment, below. He addresses the concerns of this post as follows: "Pairings of experts can work very well. But they have to stop explaining why to each other, and just state partial what. Assume the other expert gets the why, or can create his own. Don't slow down the thinking to talk; shorten the talking to fit within the fast cycles and focus on the part that will challenge the other guy's thinking." [emphasis added]
Very insightful post, thanks. I don't program any more, but I have struggled to understand how pair programming could really work. When coding you are deep, about six layers of thought down some rabbit hole of an idea; having to verbalise every step would be an impediment. Then again, there is compelling evidence (I've heard) that it really works. Can't wait to see some more discussion on it.
The one catch with this is that Pair Programming originally came from experts programming together. I’ve done it once and it was fantastic 🙂 Tim
@Caroline: thanks for your comment. I hope to do a followup post discussing the evidence. (As you suggest, I can't very well cast a theoretical doubt on pair programming without also discussing the evidence in practice.)

@Tim: When you did it, how did pairing fit (or not fit) your customary solo thought processes? I'm not saying it had to match; I'm just interested in hearing from someone for whom pairing worked well, whether you found that you had to adopt different ways of thinking, and what the pros and cons of those ways were. FYI, my own experience of pairing is, mostly, consistent with the issues suggested in this post. One partial exception was when I was paired with a business expert – in which case our complementary knowledge made the experience more pleasant and effective.
Hi John, It’s been my experience that two experts that do not share a common domain language do indeed encounter this slowing problem, at least until they’ve spent enough time in collaboration for each to develop an effective internal translator for the other. Do you have any thoughts on how this plays out into the shared cognition phenomenon? Cheers!
Jonathan, No, I’m not sure how that plays out into the shared cognition phenomenon. I guess I was hoping you might know 😉 … … or at least be prepared to take a guess. In the cases where you’ve seen it work well, how did the paired conversations of the experts compare with those of non-experts? Were the experts talking as often as non-experts, just at higher levels of abstraction? Did they need to form shared vocab about code/design in addition to the business domain? What other qualitative or quantitative differences did you see with the experienced expert pairs?
No time to write, but have had time to google: https://c2.com/cgi/wiki?PairProgramming The thing with all the early agile research is that it was done with experts – so the question in the agile literature has always been along the lines of "will this scale to not-so-good people" 🙂 Tim
@Tim I’m guessing you mean the likes of the 1999 Laurie Williams study? It used students, not programmers with the 10 or so years of real-world experience needed to develop the kind of expertise I’m talking about here. Yes, they were advanced students, but still students. See https://alistair.cockburn.us/Costs+and+benefits+of+pair+programming
Great post. I find communicating why I hold certain opinions highly challenging. However, I invariably learn a lot from the explanations I fabricate… It is almost as if being forced to explain your intuitions converts your unconscious beliefs into a series of principles that you can add to your arsenal of "truths" to help you explain your otherwise arbitrary preferences.
I simply find the three types of pairings (novice-novice, expert-expert, novice-expert) to be different. Each can be extremely effective, but for different reasons.

Ignoring the other two for the moment, the expert-expert pairing tends to be the one that most iteratively evolves the ubiquitous language (and thus design) of the team. It uncovers problems not in the code, but in the way we think about the system. Part of this is that the experts don't want to get slowed down. So they start speaking in a rapid pidgin. I had a session the other day where the transcript was something like (A was driving):

A: "Another tree transform?"
B: "Visitor?"
A: "Shouldn't need so many."
(typing starts; implementing visitor on background thread while conversation continues)
B: "Yeah; refactor later. Should request really be a tree?"
A: "Point. Mostly not recursive."
B: "Point. And mostly use nameable parts at a time."
A: "Compiler? Have you heard of nanopass?"
B: "Worth exploring. Recast as optimization problem."
(visitor finishes, with tests)
B: "Nulls?"
A: "Usual throwing guard. Exception path only."
A: "But handler is wrong."
B: (raises voice to catch attention of rest of team) "We're seeing some problems in null-path exception handlers again. We should discuss what we want to do here. There's a lot of duplication and we seem to keep making errors. See if you can refactor to a better pattern."
A: "Any ideas?"
B: "No. Let them figure it out."

In that session we implemented a chunk of code, identified that we were over-using a particular design pattern (which indicated to us that we had a more fundamental design problem), and found some missing error logic (which indicated another systemic design flaw). We then raised one of those problems to the team's attention and agreed (silently) to keep noodling on the other ourselves. This transcript covers a little less than two minutes of time. The airwaves were full about half of the time.

Pairings of experts can work very well.
But they have to stop explaining why to each other, and just state partial what. Assume the other expert gets the why, or can create his own. Don't slow down the thinking to talk; shorten the talking to fit within the fast cycles and focus on the part that will challenge the other guy's thinking.
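(For readers who don't know the shorthand in Arlo's transcript: the "visitor" under discussion is the classic Visitor design pattern applied to a tree transform. Below is a minimal, hypothetical sketch of that idea – the expression-tree nodes and the constant-folding transform are invented for illustration, and are not from Arlo's actual code.)

```python
# Minimal, hypothetical sketch of a Visitor-style tree transform.
# Node and visitor names are invented for illustration.

class Node:
    def accept(self, visitor):
        # Dispatch to the visit method matching this node's class name.
        return getattr(visitor, "visit_" + type(self).__name__)(self)

class Num(Node):
    def __init__(self, value):
        self.value = value

class Add(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

class ConstantFolder:
    """A visitor that rewrites Add(Num, Num) into a single Num."""
    def visit_Num(self, node):
        return node

    def visit_Add(self, node):
        left = node.left.accept(self)
        right = node.right.accept(self)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(left.value + right.value)
        return Add(left, right)

tree = Add(Num(1), Add(Num(2), Num(3)))
folded = tree.accept(ConstantFolder())
print(folded.value)  # 6
```

The point of the pattern in this context is that each new transform over the tree becomes a new visitor class; "another tree transform?" / "Visitor?" is the whole design conversation.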
Thanks Arlo. Very informative and thought-provoking. I have duplicated your last paragraph in the main body of the post, to make sure future readers won’t miss it.