A sports team has a play book, does your team? A sports team practices together, does your team? A sports team works as a unit, does your team?
Too many times I have seen engineering teams that are a team only on the org chart. In reality they solve tickets as individuals, with only a little interaction through pull requests. Otherwise they might as well not even know each other. They are a team not as in basketball or football, but as in golf, where once you get to the tee, it's you and only you getting the ball in the hole.
This is why I like XP. Their teams really are teams like you say.
Though I think in many dev shops you can be team like. Someone might like refactoring and cleanup. Someone else is good at rapid prototyping. Another architecture. Sometimes a great dev is just the one who can take the unglamorous tickets and get them done at a sustainable pace. Or someone who is good at devops, teaching, or morale building. Sometimes just communication.
Everyone has different strengths. No one would ever say "who's the best (American) football player?" because you'd have to ask "who's the best kicker, tight-end, defensive lineman". They are all different roles.
That football has enough awareness to know skill cannot be measured on a single dimension makes it both sad and laughable that people so often reduce programming skill to one.
The military often go beyond team training in a way that very few other organizations do, and conduct 'collective training' that involves multiple teams. Collective training itself has multiple levels - e.g. at the lowest level, two or more tank crews working together in a tactical task (e.g. when four tanks encounter an enemy, which one should engage it?), gradually adding other functions (e.g. infantry, artillery, etc) so that all of the different tactical 'trades' have formal training in how to work together. At these higher levels, the feedback and qualifications are aimed at the units rather than the individual soldiers.
The military are also conscious of group dynamics, for example the 'forming, storming, norming' that occurs when team membership changes, and the effect of 'churn' on a team as individuals join and leave.
> A sports team has a play book, does your team? A sports team practices together, does your team? A sports team works as a unit, does your team?
It's a great analogy, but the author should also keep a couple things in mind:
- Out of all the football teams in the world, only 32 can ever win the Super Bowl
- Most of the money is in the Super Bowl (winner takes all)
- Professional athletes are extremely well coached and compensation is extremely competitive
- Professional athletes only train and play: there's support staff for everything else
How many times has a golfer demonstrated an idea that upended the whole field? Yet this is commonplace in engineering, because engineering is essentially a creative task.
> the effectiveness of productive effort
and
> the state or quality of producing something
Those are quoted from the dictionary definition of productivity, and in my opinion that definition contains a great insight. Productivity is about the "product" first and foremost.
One thing that's often missing is that teams don't have ways to quantify how good the product itself is. Most teams will instead pivot to trying to measure their rate of change to the product. But that doesn't mean you have become more productive, because a software product is not like food production: more of it is not always better. Better software is instead about being more ingenious, more intuitive, more tailored to the problems of its users, more responsive, with fewer malfunctions, and so on.
So to me, this whole attempt to measure "productivity" while disregarding the product is incomplete. It's trying to measure developer efficiency at changing things, without caring whether the changes are for better or worse. This includes measuring velocity, lines of code, or tickets closed. But it also includes what this article proposes: number of meetings, time to complete code reviews, developer satisfaction with their tools, etc.
All of these try to see how quickly developers can make changes without ever measuring whether the change produces a better product, and thus whether it actually made the team more productive at improving the product.
I'd like to hear about ways to measure software product state and quality. If we had those, you'd have an easy way to know how productive a team would be by seeing how quickly they can improve the product.
This always takes me to: USE YOUR PRODUCT. If you don't use your product, you'll never know if it is good. My team builds internal tools for our support folks, and we also use these tools every day to answer our own questions about internal problems. Are the tools we make getting better? Just consult our "How long does it take to answer X?" KPI. We have a set of known common problems; if our support folks are spending less time figuring out the answers to these gold standards, our product is improving. If they're spending longer, we've regressed and we need to change something.
I'll grant that making internal tools grants us a lot of gimmes: we're not tasked with taking advantage of users, and we can spend time training users intensively on new releases. But the underlying principle is the same; a chef will never know if a recipe is good without tasting it himself, and you'll never know if your team is successful if you don't know whether the product you've created is good.
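As a rough illustration, the "gold standard questions" KPI described above can be as simple as comparing each period's median time-to-answer against a baseline. All question names and numbers here are invented:

```python
from statistics import median

# Baseline median minutes-to-answer for a fixed set of "gold standard" questions.
baseline_minutes = {"refund status": 12.0, "account lockout": 8.0}
# Observed answer times (minutes) for the same questions this quarter.
this_quarter = {"refund status": [9, 11, 10, 14], "account lockout": [7, 6, 9]}

for question, times in this_quarter.items():
    now = median(times)
    delta = now - baseline_minutes[question]
    verdict = "improved" if delta < 0 else "regressed"
    print(f"{question}: {now:.1f} min ({verdict} by {abs(delta):.1f} min)")
```

The fixed question set is what makes this comparable over time; changing the questions resets the baseline.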
This is a bit absolutist. Sure, use your product when it makes sense like in your case but many people here build things for user groups they don't belong to (constantly).
Besides: are you hitting all the same use cases your users are and at the same rate?
This advice always sounds nice but it's impossible to apply for lots of people. More generically applicable advice is that you should be acutely aware of your users real(!) experiences. This can be by using the product or by making sure the team sees and hears raw user feedback. Preferably combined with data that helps prioritize.
There's nothing like hearing the frustration in someone's voice when they are trying to accomplish something and _your product_ is holding them back. (well, except for experiencing it yourself and then we've circled back ;) )
Measuring productivity has to start with what matters: do your end-users (these are not necessarily your customers) like what you do? And often it is hard to get this information. Just asking them will lead to biased responses, but there are ways to deal with that.
I happen to be in a business (let's call it food) where I can observe the end user using more of their money with our customers rather than their competitors when we do things right. That's a strong signal -- probably stronger than asking them -- so a very fortunate starting point. (Of course, there are still confounding terms like seasonality, economy etc to grapple with, but there are ways to deal with that too.)
Starting with that one measurement that matters, one can begin exploring proxies. Set up a hypothesis: "Velocity would be easier to measure and the number would be available faster. Does it correlate with our one good metric?" And then you run the experiment. It might take weeks or months to get back the end-user-happiness data that corresponds to this week's velocity, so these tests are expensive (but pay off many times over when you find good proxies).
In the end, you should be able to construct a somewhat sensible model of user happiness, and answer questions like, "if we hire another team member and therefore increase velocity by 3 %, how much happier will our users be? And what is that worth in sales?"
When you can convert everything to the same unit of measurement (dollars are an easily explained option, but log-dollars are a personal favourite of mine) you get intense clarity and alignment around priorities and decisions.
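The proxy experiment described above boils down to correlating weekly velocity against the end-user metric once the lagged data arrives. A minimal sketch, with purely illustrative data:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Weekly velocity (story points) and the end-user happiness score that
# eventually came back for each of those weeks. Numbers are made up.
velocity = [21, 34, 28, 40, 35, 38, 45, 41, 50, 48]
happiness = [3.1, 3.3, 3.2, 3.6, 3.4, 3.5, 3.9, 3.7, 4.1, 4.0]

r = pearson_r(velocity, happiness)
print(f"correlation between velocity and lagged user happiness: r = {r:.2f}")
```

A strong correlation would justify using velocity as a fast, cheap stand-in; a weak one means the proxy hypothesis failed and you try another.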
----
All of that said, development speed, as defined in the Accelerate study in particular, is one of those generally good things you pretty much unconditionally want. The reason is given in the study and expanded on further in Reinertsen's The Principles of Product Development Flow.
The reason speed is important is that successful product development is controlled by surprises. You will discover something tomorrow that will make you wish you had prioritised differently today, and being able to pivot quickly on surprises is how you both de-fang the biggest risks, but also how you throw yourself at opportunities before your competition even realises there is one.
A bit too much hyperbole for my taste, given the less than groundbreaking ideas.
Consider -
I have two teams, with the same staffing levels and the same general seniority. For this example, let's assume each is a team of 5, with 1 tech lead, 2 seniors, and 2 juniors.
Both teams have the same approximate meeting count, both work on the same stack with the same dev tools.
Team A consistently releases new features faster than team B. Why?
Because if the answer is "Find the blocker" aren't we right back at
> "your engineering leaders will simply justify failures, telling stories like "The customer didn't give us the right requirements" or "We were surprised by unexpected vacations.""
except with blockers this time?
Maybe Team A is actually just better than Team B.
Maybe Team B is actually working on a feature set that has more inherent complexity.
Maybe Team A releases faster but also has more incidents in prod.
Maybe Team B releases a larger changeset on average.
None of this is getting addressed or answered.
----
None of that is to say that measuring blockers isn't a useful idea, but it's certainly not some silver bullet.
So if someone says "it's because we're blocked on [slow delivery of designs from another team]" and you measure that specifically, and then improve it, and you notice the team's output hasn't changed, you've learned something.
I've certainly seen those reasons before, but haven't seen people turn them into specifically measured things versus "ok let's see if we can improve it" with often little or ineffective followup.
From that you can get measurements on how long each stage takes and the duration of each transition.
From there you can compare team A and B. The transition times is where the human time cost usually sits.
Just getting the time from when a Jira ticket or feature is raised, to when it is picked up, to the first commit, to the first test, to the final build already gives you valuable insight.
The points you raised towards the end can be answered if observability of your CI/CD pipeline is actually in place, or at least it gives you a place to start a line of inquiry.
Naturally the blockers will be aggregated into some of the values but as you work through the journey, they will start clustering at certain stages and maybe highlight a significant problem that needs to be addressed.
There's a wealth of data being left on the table that can help inform management decisions.
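A minimal sketch of pulling those stage timings out, assuming a ticket record with one timestamp per stage (the stage names here are invented, not Jira's actual fields):

```python
from datetime import datetime

STAGES = ["raised", "picked_up", "first_commit", "first_test", "final_build"]

def stage_durations(events):
    """events: {stage_name: ISO timestamp}; returns hours per transition."""
    times = [datetime.fromisoformat(events[s]) for s in STAGES]
    return {
        f"{a} -> {b}": (t2 - t1).total_seconds() / 3600
        for (a, t1), (b, t2) in zip(zip(STAGES, times), zip(STAGES[1:], times[1:]))
    }

ticket = {
    "raised":       "2023-05-01T09:00",
    "picked_up":    "2023-05-03T10:00",  # 49h of queue time before anyone starts
    "first_commit": "2023-05-03T16:00",
    "first_test":   "2023-05-04T11:00",
    "final_build":  "2023-05-04T15:30",
}
for transition, hours in stage_durations(ticket).items():
    print(f"{transition}: {hours:.1f}h")
```

Aggregated over many tickets, the transitions with the largest durations are exactly the "human time cost" clusters mentioned above.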
I think you have to measure with the intent to improve how your team works. If a manager can measure at the team level and open up visibility into the development process, they can hopefully find where things get frustrating (ex. waiting for someone to review a PR).
That said, there seem to be more mature tools out there than OKAY's beta. There's a discord server called dev interrupted that talks about this stuff a lot.
If you look at the bus factor section for the vscode and gitlab repositories
https://imgur.com/NfgvvTy (vscode)
https://imgur.com/DK7rvfx (gitlab)
You'll find they both have a large cluster of developers in zone 2. For developers to exist in zone 2, they have to have medium to high impact on the code that they worked on, but not clash with others. If you look at the vuejs-next repository
https://imgur.com/eDAOyPW (vuejs)
You can see it's actually a pretty fragile project, since Evan is responsible for pretty much everything.
Based on what I've observed by studying successful open source projects, you actually want to discourage "very high impact" employees, since they introduce knowledge risk.
Edit: The metrics that I'm showing are limited to the last 90 days for TypeScript, JavaScript and CSS code.
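For what it's worth, a crude commit-count approximation of this kind of knowledge risk (the linked tool's "zones" surely use richer signals than this) could look like:

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of authors who together account for `threshold`
    of all commits. Low values mean knowledge is concentrated."""
    counts = Counter(commit_authors).most_common()
    total = sum(c for _, c in counts)
    covered, n = 0, 0
    for _, c in counts:
        covered += c
        n += 1
        if covered / total >= threshold:
            return n
    return n

# One dominant author (the fragile, Vue-like shape)...
print(bus_factor(["evan"] * 90 + ["a", "b"] * 5))   # 1
# ...versus work spread across a team (the VS Code-like shape).
print(bus_factor(["a", "b", "c", "d"] * 25))        # 2
```

Commit counts alone overweight mechanical changes, so real tools weight by impact, but the shape of the distribution is the point.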
This is a dangerous conclusion, especially for a business. If you're in pure maintenance mode, maybe... but otherwise... You want people who can pitch in anywhere, who can fix things rapidly, who can build new solutions quickly when required, and who know your business inside and out.
You just don't want knowledge siloed there, so you want to make sure other people are also on the path to being expert on the various areas.
A study from 2016 at Google discovered that team effectiveness is related to the opportunity for "equal speaking". Similar to your conclusion.
https://www.nytimes.com/2016/02/28/magazine/what-google-lear...
If you're trying to sell this concept to developers, you might want to change "discourage" to "hire more than one"
Maybe Evan is a liability, but he might also be the reason why VueJS is popular and valued in the first place.
It also seems pretty strange to me to count VSCode and GitLab in there, because those are worked on by companies with paid teams behind them, which will have constant churn.
If you took VueJS and had a team at Microsoft take it over from Evan, it too would live on. So I don't think the "risk" here has much to do with "very high impact" individuals; it's more about the difference between a project maintained by a community of backers in their free time and one maintained by a company that hires developers to work on it full time.
However, most organizations would pile more work onto devs if they finish early, so devs compensate by taking more time on their current tasks. Why finish early, when you'll just be thrown another task right away?
Prod can be running as smoothly as you like - it doesn't matter if your company is having its lunch eaten by competitors adding or improving features faster than you.
You should be incentivizing employees to step back and look at the big picture every now and then. If customers are happy, the company is profitable, and prod is running smoothly? Go take some time off, head out early, etc.
Is revenue down? Company growth slowing, or worse, is the company shrinking? Time to work. Prod being red or green has little to do with it.
Speaking as a dev, the best way to motivate me is to give me some head cracking challenge, trust (no reporting) and autonomy (don’t tell me how to do my work).
The challenge is whether the organization prioritizes robust systems and devotes resources to making things more observable, reliable, and resilient. The team can want with all their heart to engineer fixes to common failure modes but if the decision makers are always pushing full steam ahead on new features it can be really difficult to improve the app.
The only metrics that will matter are “did you get the stuff done that we wanted?” (where “stuff we wanted” is defined vaguely and the meaning shifts to suit product leaders ever changing political landscape), “did you get it done really fast?” (where “fast” is vaguely defined according to product political circumstances and never puts any weight on engineering estimates, staffing needs, or resource limits) and “did you get it done cheaply?” (where “cheap” is defined vaguely based on various internal politics and budget turf wars as well as larger company financials - and when the money is good and no one looks too hard at this, it allows sweeping other issues under the rug, no one cares if you burnt a quarter working on the wrong problems).
Assessing effectiveness, in principle, is purely a political concept that operates from the top down.
This has been true in every company I’ve worked for or knew a colleague or friend who worked there - from tiny “lean” startups to extreme cultures like Bridgewater to every mid-sized or large tech company.
And we've come full circle back to measuring a team's productivity being an art form. Maybe you have a metric now, hours spent in meetings per week, but nobody knows how much is too much and how little is too little. How do you measure the impact of more meetings on productivity? Or of fewer meetings? With "time in meetings" as your measure of productivity? That's a circular dependency.
Measuring process inputs will favor Sisyphean work. Appearing to be working, hours punched in and butts-in-seats will be more valued than results. Many companies are stuck in here.
Measuring proxies of outputs such as lines of code, or number of tickets closed, as the article mentions, only leads to people gaming the system.
Measuring the actual value delivered is obviously the best thing to do. However it is often difficult to even define value and the amount of it created. Which is always a problem especially for inhouse projects or in the absence of direct contact with the market.
I like what the article suggests - measuring process flow instead of measuring inputs or outputs. Can't say exactly why I like it but it seems that engineer types are naturally motivated to perform and learn, so giving them enough space is a good way to get working systems as a side effect.
On one extreme there is no shield around engineers and they experience constant whiplash with "code oracling", shifting priorities, obscure trivialities, and other things that can drive people insane and can prevent meaningful work from being completed.
On the other hand, sticking perfectly to "we're committed to this sprint and unless its on fire you wait in line" and much of your business becomes unnecessarily rigid and painful. That, IMHO, is much worse.
Finding a balance is crucial, and to me that's just as interesting a data point.
Congrats, according to Google it seems like you invented a new word. Could you define it for us?
If it's an indication that something is wrong with the team, I think that's still helpful. For instance, if the team does not have the right experience and are often blocked waiting for someone else's answer to a question, it might make sense to work on team composition. It's not about blaming people or teams, but about improving flow.
(Assuming an org that's making a genuine attempt to improve, versus just a cynical blame game. But in the latter case... you're screwed sooner or later anyway.)
The big issue it has, IMO, (which other DeMarco books like 'Slack' share) is that it does not provide actual case studies that would convince a typical executive / manager working in a high pressure / competitive environment to change their methods.
Along with Mythical Man-Month, it's clear that we didn't need any other management books.
That's a HUGE if right there. In my experience, the most problematic teams I've ever worked with were simply full of incompetent and/or unmotivated engineers, and for whatever reason it was impossible to replace them.
So while it’s good to remove obstacles, and maybe helpful to measure whether obstacles exist, it doesn’t mean much when no obstacles exist and none have been removed in a long time. Maybe the team is super awesome and chugging along thanks to all those removed obstacles. Maybe they are super sucky because they don’t see things as obstacles.
Engineering activities must make sense as economic activities.
The productivity of engineering activities can be measured the same way productivity is measured for any other economic activity.
See "Internal Market Economics: Practical Resource-Governance Processes Based on Principles We All Believe In", Book by N. Dean Meyer.
The posted article came so close to the idea of the right measurements. I was intrigued and thought "yes, say it!", but the author didn't say what I expected; instead they pointed at quantifying calendars and commit logs. What a bizarre self-contradiction.
The book I referred to answers the "but engineering team doesn't have revenue".
Engineering can be art, sure, and if you like, we can even discount the idea that art creation can be seen as economic activity. But is there a need to measure productivity then?
The economy is basically a population based evolutionary process. Its nature is open ended - things that seem important now might have been seen as being useless or stupid a decade or two ago, like neural nets in 1985 and personal computing in 1950's. So we can't tell ahead of time which will become important, but the exploration process and extreme diversity are the main gain. It's all about getting to those stepping stones that we don't even know we will need. Inspired engineering is eventually rewarded with economic success, just like biological evolution.
Nothing is happening without funding.
A business can't keep producing things that are not valuable in the moment; it will go bankrupt.
Government can sponsor such things as it sees fit, best if guided by a clear hypothesis that funding it makes the outcomes probabilistically better than not funding it.
When nobody is paying you but you're tinkering with some stuff, then it's you who is sponsoring it with your time and other resources.
> Productivity in engineering therefore naturally increases when you remove the blockers getting in the way of your team.
The guy just discovered the job of the PM, or what?
If I had to choose, I’d choose no metric over lines of code or number of tickets, etc.
I think the issue is that when orgs get big enough they have lots of teams and having some comparability is useful to find lessons learned to spread among teams, etc. I don’t know of any metric that is truly objective for figuring out high productivity teams or individuals just by using it.
I think there are some “vital signs” that you want projects to have, but don’t want to fixate on the actual value. Like you don’t care if someone’s pulse is 70 or 80 or 90, but you want to make sure pulse is checked.
I think having reviews in git is helpful and not having any is something to look into.
I think having contributions from other teams is a good sign, although absence isn't necessarily bad. The positive sign is others finding the project, asking questions, reviewing, or contributing material; that probably indicates reuse.
I think encouraging information sharing through lunch and learn presentations is good, but is delicate to avoid gaming from people just “making the circuit.”
Having an automated CI/CD is a good sign and if code is making it to prod without one, it requires looking into.
Theoretically a healthy team will have all these signs, but you could have an awesome team with low numbers and a terrible team with high numbers. So these metrics wouldn’t be useful for detecting productivity and comparing across teams, but would be good for just finding big, lurking problems.
I comically work in an org where there are whole teams not using source control, so the “only I can measure myself, give me more money” is a very real challenge.
A team reflecting on progress at regular intervals will naturally bring up processes, meetings, tools, etc, and a manager or team lead can easily add these questions into the mix for reflection and/or discussion as part of this process.
The key distinction from this seems to be hidden in the line "Finally, turn all these questions into metrics" - but the article could definitely do a better job of highlighting the differences between their "solution" and retrospectives (as well as more general good management practices like talking to your team).
There are some interesting ideas in there that seem tied to their product (https://www.okayhq.com), but the article doesn't really progress far enough past the high level ideas to be practical and rather a lot is left as an exercise for the reader!
I prefer to measure value created and partially tie it to compensation, directly and indirectly.
For a dev team on a trading platform, we had an index of income created by the product, approved by the team and stakeholders, that affected the total compensation paid to the team.
This is a specific example. The general principle is let tech teams make decisions and compensate them for value created, at least partially.
Otherwise, when you don't want to share your money, "measurement of productivity / effectiveness" comes in. Because when you measure money, people ask, "why am I not getting more when I make you more?" But if you're not measuring money, why run a business?
Person A has an interesting-but-not-exceptional idea and gets Person B to fund it. People C, D, and E code it.
For most interesting-but-not-exceptional ideas you could replace all of these people with others.
If you actually ran this as an experiment in a Monte Carlo-but-real kind of way, you'd find some collaborations would do well, some would do badly, some would fail completely, and a few might explode (in a good way).
How do you quantify the value of the relative contributions?
How shall we quantify value, and how much of that index will affect compensation.
It's answering the question "why do some people get paid more" in a transparent way.
How does this apply to team that maintains your source control and continuous integration systems?
Most of the ideas predate software as well. People had remarkably complex paper based sales systems.
I think the difference today that the author referenced is that with Salesforce stuff like number of leads, contacts, conversion, revenue and profit per conversion, etc is all captured and reportable in sort of real-time. There’s still lots of gut inputs like how the meeting went or having people estimate probability of closure, etc. But now it’s more systematic.
I don’t know if it’s more successful, but it seems like I get more sales calls and repeat contact attempts.
If there are more blockers, is the team doing better (because they're covering ground faster and finding new blockers faster), or worse (because they're being blocked more)?
If I have to report to the CEO on the Dev team productivity, do I tell them how many blockers we removed?
It’s turtles all the way up.
I actually think that one’s compensation is correlated to how ambiguous the job duties are and how hard it is to measure success, given employment.
Stuff that’s easy to measure gets commoditized. Stuff that’s hard to measure, but still important is hard to staff, creates risk and is easy to mitigate with money.
E.g.:
- Deployment frequency
- Lead time for changes
- MTTR
- Change failure rate
A quick google reveals a fair amount of existing material :
https://leadingagileteams.com/2020/04/07/forget-dumb-product...
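A minimal sketch of computing those four metrics from deployment records (the record format here is invented for illustration):

```python
from datetime import timedelta

deploys = [
    # (merge-to-deploy lead time, failed?, time to restore if failed)
    (timedelta(hours=4),  False, None),
    (timedelta(hours=30), True,  timedelta(hours=2)),
    (timedelta(hours=6),  False, None),
    (timedelta(hours=12), False, None),
]
days_observed = 7

deployment_frequency = len(deploys) / days_observed
lead_time = sum((d[0] for d in deploys), timedelta()) / len(deploys)
failures = [d for d in deploys if d[1]]
change_failure_rate = len(failures) / len(deploys)
mttr = sum((d[2] for d in failures), timedelta()) / len(failures)

print(f"deploys/day: {deployment_frequency:.2f}")
print(f"avg lead time: {lead_time}")
print(f"change failure rate: {change_failure_rate:.0%}")
print(f"MTTR: {mttr}")
```

The hard part in practice is not the arithmetic but getting clean definitions of "deploy", "failure", and "restored" out of your pipeline and incident records.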
It has failed every time.
When the latest management fad fails to live up to expectations, new terminology is created to hide the failure. The new terminology simply replaces the old, and they can sell new and improved management concepts again and again.
Software Engineering is extremely difficult. That's why successful Software Engineering companies are highly valued as unicorns. These companies have temporarily captured the right type of people tackling the right type of problems and delivered the right type of Software Engineering solutions. The process cannot be replicated. That's why it's so valuable and worth trillions of dollars.
There's no other Microsoft. There's no other Apple. There's no other Google.
These are unique companies with unique products that provide value to customers. It took years and decades of Software Engineering man hours to iterate until they delivered the valuable software solutions.
A lot of these Software Engineering management fads are trying to capture something that does not exist.
Productive Software Engineers are exceptional. The sooner Software Engineers understand that paradigm, the better they will be able to value themselves and the more effective they will be.
The point is learning from those who've succeeded before you, instead of claiming it's just magical chemistry happening. Or worse, starting again from first principles.
No, that won't produce product-market fit, nor does it open up once in a lifetime opportunities, but it allows teams to execute well. The belief that productive software engineers are unicorns, however, and need to be valued above all else? That's what destroys companies.
Google paid Anthony Levandowski $120 Million to lead their autonomous car project, before he went to Uber and the legal troubles. Uber was paying him more than Google.
Facebook paid $16 billion for WhatsApp, which only had around 50 engineers at the time. Facebook was buying the IP and the productive software engineering members of the team.
There are countless other examples.
No amount of management process can create productive software engineers. That's why companies recruit productive software engineers from other companies. That's why companies buy other companies that have productive software engineers with proven products.
Productive Software Engineers are exceptional. Their work is worth trillions of dollars and generates billions of dollars in revenue every year.
In economics as far as I know, they measure productivity via salaries. The more money people make, the more productive they are assumed to be!
'GDP = (GDP/Hours) * Hours, (1)
where Hours is the total number of worker-hours.
Chart 1 depicts changes in each of these components over time. For the entire 1961-to-2012 period, labour productivity advanced at a 1.9% annual average, accounting for slightly more than half of the increase in GDP growth. The rest is attributed to hours, which increased at 1.5% per year on average.
Aggregate GDP measures the returns to both labour and capital. Distributional concerns lead to questions about whether the share going to labour increases over time and, in particular, how productivity growth is related to real income.'
https://www150.statcan.gc.ca/n1/pub/15-206-x/15-206-x2014038...
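To make the quoted decomposition concrete: since GDP = (GDP/Hours) × Hours, growth rates compose multiplicatively, and the excerpt's two figures roughly add up to GDP growth:

```python
# Figures from the StatCan excerpt above.
productivity_growth = 0.019   # labour productivity (GDP/Hours), 1.9%/yr
hours_growth = 0.015          # total worker-hours, 1.5%/yr

# Exact multiplicative composition of the two growth rates.
gdp_growth = (1 + productivity_growth) * (1 + hours_growth) - 1
print(f"implied GDP growth: {gdp_growth:.2%}")
```

The result is about 3.43% per year, of which productivity's 1.9% is "slightly more than half", matching the excerpt; the small excess over the simple sum 1.9% + 1.5% is the cross term.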