One key distinction that is nuanced and easily lost: the original article is titled “Can You Measure Software Developer Productivity?” The authors are specifically referring to an individual’s productivity, not a team’s. A team’s productivity is often given the term “velocity” (usually tied to agile sprints and story points, another game-able system to be sure). I want to dig into this a bit more, because the scope is as important as the measure.
I’ve been managing engineering teams of 2-5 people for some time now, and have often struggled with this. To connect it to the WBR idea that one division’s output metric can be another’s input metric: one individual’s output metric is a team’s input metric. So I think taking into account the shift in the mixture between individuals and teams as we move through Beck’s model is important. Here’s my stab at it.
Effort — simply an individual’s time spent working on, figuring out, and solving/coding a particular engineering problem or task.
Output — number of tasks completed (overly simplified), which is strongly correlated with number of PRs. This can roughly be equated to effort * individual talent/skill/intelligence. (We’ve all seen incompetent developers take a week to do what a good dev can do in a few hours.)
Outcome — now we get into team dynamics. This can be thought of as the sum of all developer output * some product accuracy factor: the somewhat unknown ability of a given set of output to produce a desired behavior. Note that this factor can be negative if the product team is wrong about the feature’s outcome!
Impact — this is even trickier, as it’s now the combination of the entire org’s efforts towards an impact goal, multiplied again by some unknown customer-psychology return factor (does engagement with a certain feature really reduce churn?).
To get really mathy and belabor the point, ignoring potentially crushing external market factors (i.e. “friction”):
Impact = (A’ + O) * R
where A’ is the sum of outcomes achieved by other departments in the org, O is the outcome from the eng team, and R is an uncertain “return” on those outcomes in terms of business value.
O = V * P
where V is the development team’s velocity, and P is the product accuracy factor, which can be negative.
V = SUM(Ci)
Velocity is the sum of all developers’ contributions/output.
Ci = Ei * Ti
where Ci is an individual’s contribution, measured as their individual effort Ei times their individual talent Ti.
So, substituting it all together:
Impact = ((SUM(Ei * Ti) * P) + A’) * R
For a 3-person team, that expands to:
Impact = E1 * T1 * P * R + E2 * T2 * P * R + E3 * T3 * P * R + A’ * R
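To make that concrete, here’s a toy Python sketch of the model. Every number in it is invented purely for illustration; the point is just to show how the unknowns (P, R) dominate the observable part (E * T):

```python
def impact(efforts, talents, P, R, A_prime):
    """Impact = ((SUM(Ei * Ti) * P) + A') * R"""
    V = sum(e * t for e, t in zip(efforts, talents))  # team velocity
    O = V * P                                         # eng team's outcome
    return (O + A_prime) * R

# Three devs: effort (hours) and a fixed talent multiplier each. Made-up values.
efforts = [30, 40, 25]
talents = [1.5, 1.0, 2.0]

# Identical observable output, wildly different business results:
print(impact(efforts, talents, P=0.8, R=1.2, A_prime=10))   # ~141.6: the bet pays off
print(impact(efforts, talents, P=-0.3, R=1.2, A_prime=10))  # ~-36.6: product was wrong
```

Same three developers, same effort and talent, and the sign of the whole thing flips on P, which nobody knows in advance.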
Assuming T is relatively fixed, P and R will not be known until after the fact (and are often based on luck, “product taste”, and the kind of “business knowledge” a WBR provides), and A’ is something you have little control over: tell me, which numbers are you inclined to measure to get at individual performance? Probably E * T: your devs aren’t clocking in, so E alone is unobservable, and T is fixed, but output is at least observable. Now we see why this is so tempting.
So we see how the unknowns compound, and adding teams in there muddies the picture even more.
But my contrarian take here is that measuring individual output might not actually be a bad thing. Consider that I can tell a developer he needs to work on Project X. It’s a high-risk, experimental project. His contributions are very high, the quality of the output is very high, but our customers hate it and we shelve it. Is that developer’s productivity negative? Hard to argue that one.
I’d also argue that output is a leading indicator for the other parts of the model. All else being equal, prolific quality output almost always leads to outcome and impact over the long range. @cedric, you yourself have a high rate of prolific quality output. But what separates you from an AI content farm? The key here is quality. You can only assess quality by observing the work.
So this brings me to my next point: good engineering leads should encourage high-quality output, and often. Then it’s up to leadership to take care of the Ps and Rs, and to collaborate with other functions to encourage the A’. (Note: one indication of a senior engineer is how much autonomy and skill they show in shifting their output to meet impact, heading off incorrect assumptions early, etc.)
It’s the blocking and tackling. It’s the military drill. It might be the wrong hill to take, but dammit are we going to take it swiftly and mercilessly.
In fact I think the military analogy here is apt (I’ve never been in the military, but I’ve read a lot about battlefield leadership and war memoirs). The general is judged on the battle strategy (outcome) that leads to winning the war (impact). Every individual soldier knows the mission, but they might be tasked with manning artillery or holding a machine-gun position. A good soldier will hold that position, shoot with skill, and show a high degree of courage and teamwork. But a great soldier will see when the plan has hit a snag and work out what he can best do to still accomplish the mission.
Anyway, any productive takeaways from this ramble? I think execs should specify a desired impact, with a given time horizon and risk appetite; then product teams should know which levers of outcome they have to drive that impact, and the product team is measured that way. Then, inside the product team, the tech lead who understands and demands prolific quality will measure output quantitatively and quality qualitatively. (NB: how do you “measure” quality? Not sure. A good low-debt design should be rewarded/measured somehow.) And individuals can be trusted to take care of their own effort (the idea of a “Results Oriented Work Environment”). The tech lead also needs to put the output metric in the context of the product phase: discovery/prototype is different from development, which is different from maintenance. Finally, they also need to understand the level of difficulty or effort in each PR.
So, point being, I think output should be measured, but we need some way for a strong tech lead to contextualize that output in terms of phase, complexity, and quality. How would this be communicated to non-technical execs? Or be put under SPC? I need to noodle on this one, but does anyone have ideas here?
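One half-formed idea on the SPC angle, sketched entirely on my own assumptions: put a weekly output count (merged PRs below, but substitute whatever proxy the tech lead trusts) on a standard XmR “individuals” control chart, so only weeks that fall outside the natural process limits trigger a conversation. All the data here is made up:

```python
# XmR chart over weekly merged-PR counts. Invented data; the last
# week is deliberately an outlier.
weekly_prs = [12, 10, 11, 13, 12, 11, 10, 12, 11, 3]

mean = sum(weekly_prs) / len(weekly_prs)
moving_ranges = [abs(b - a) for a, b in zip(weekly_prs, weekly_prs[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard XmR constant: natural limits = mean +/- 2.66 * average moving range.
upper = mean + 2.66 * mr_bar
lower = max(0.0, mean - 2.66 * mr_bar)

print(f"center {mean:.1f}, natural limits [{lower:.1f}, {upper:.1f}]")
for week, prs in enumerate(weekly_prs, start=1):
    flag = "  <- worth a conversation" if not (lower <= prs <= upper) else ""
    print(f"week {week:2d}: {prs}{flag}")
```

That wouldn’t answer the phase/complexity/quality contextualization on its own, but it might at least give the non-technical execs a chart that separates signal from routine variation.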