The Limits of Planning Poker

As an exercise, planning poker can be quite useful in instances where no prior method or process existed for estimating levels of effort. Problems arise when organizations don’t modify the process to suite the project, the composition of the team, or the organization.

The most common team composition for these these types of sizing efforts have involved technical areas – developers and UX designers – with less influence from strategists, instructional designers, quality assurance, and content developers. With a high degree for functional overlap, consensus on an estimated level of effort is easier to achieve.

As the estimating team begins to include more functional groups, the overlap decreases. This tends to increase the frequency of  back-and-forth between functional groups pressing for a different size (usually larger) based on their domain of expertise. This is good for group awareness of overall project scope, however, it can extend the time needed for consensus as individuals may feel the need to press for a larger size so as not to paint themselves into a commitment corner.

Additionally, when a more diverse set of functional groups are included in the estimation exercise, it become important to captured the size votes from the individual functional domains while driving the overall exercise based on the group consensus. Doing so means the organization can collect a more granular set of data useful for future sizing estimates by more accurately matching and comparing, for example, the technical vs support material vs. media development efforts between projects. This may also minimize the desire by some participants to press harder for any estimates padded to allow for risks from doubt and uncertainty, knowing that it will be captured somewhere.

Finally, when communicating estimates to clients or after the project has moved into active development, product owners and project managers can better unpack why a particular estimate was determined to be a particular size. While the overall project (or a component of the project) may have been given a score of 95 on a scale of 100, for example, a manager can look back on the vote and see that the development effort dominated the vote whereas content editors may have voted a size estimate of 40. This might also influence how manager negotiate timelines for (internal and external) resource requirements.


Photo by Aditya Chinchure on Unsplash

Achieving 10x

Crossed paths with an old but still relevant conversation thread on Slashdot asking “What practices impede developers’ productivity?” The conversation is in response to an excellent post by Steve McConnell addressing productivity variations among software developers and teams and the origin of “10x” – that is, the observation noted in the wild of “10-fold differences in productivity and quality between different programmers with the same levels of experience and also between different teams working within the same industries.”

The Slashdot conversation has two main themes, one focuses fundamentally on communication: “good” meetings, “bad” meetings, the time of day meetings are held, status reports by email – good, status reports by email – bad, interruptions for status reports, perceptions of productivity among non-technical coworkers and managers, unclear development goals, unclear development assignments, unclear deliverables, too much documentation, to little documentation, poor requirements.

A second theme in the conversation is reflected in what systems dynamics calls “shifting the burden”: individuals or departments that do not need to shoulder the financial burden of holding repetitively unproductive meetings involving developers, arrogant developers who believe they are beholding to none, the failure to run high quality meetings, code fast and leave thorough testing for QA, reliance on tools to track and enhance productivity (and then blaming them when they fail), and, again, poor requirements.

These are all legitimate problems. And considered as a whole, they defy strategic interventions to resolve. The better resolutions are more tactical in nature and rely on the quality of leadership experience in the management ranks. How good are they at 1) assessing the various levels of skill among their developers and 2) combining those skills to achieve a particular outcome? There is a strong tendency, particularly among managers with little or no development experience, to consider developers as a single complete package. That is, every developer should be able to write new code, maintain existing code (theirs and others), debug any code, test, and document. And as a consequence, developers should be interchangeable.

This is rarely the case. I can recall an instance where a developer, I’ll call him Dan, was transferred into a group for which I was the technical lead. The principle product for this group had reached maturity and as a consequence was beginning to become the dumping ground for developers who were not performing well on projects requiring new code solutions. Dan was one of these. He could barely write new code that ran consistently and reliably on his own development box. But what I discovered is that he had a tenacity and technical acuity for debugging existing code.

Dan excelled at this and thrived when this became the sole area of his involvement in the project. His confidence and respect among his peers grew as he developed a reputation for being able to ferret out particularly nasty bugs. Then management moved him back into code development where he began to slide backward. I don’t know what happened to him after that.

Most developers I’ve known have had the experience of working with a 10x developer, someone with a level of technical expertise and productivity that is undeniable, a complete package. I certainly have. I’ve also had the pleasure of managing several. Yet how many 10x specialists have gone underutilized because management was unable to correctly assess their skills and assign them tasks that match their skills?

Most of the communication issues and shifting the burden behaviors identified in the Slashdot conversation are symptomatic of management’s unrealistic expectations of relative skill levels among developers and their inability to assess and leverage the skills that exist within their teams.


Image by alan9187 from Pixabay