DevOps Book Review: Team Topologies
Here are my insights from reading the Team Topologies book:
* I never knew that what an organization builds tends to mirror its communication paths! Interesting revelation! This is Conway’s law. For example, our QE team used to report to a Head of Quality, who in turn reported to the Head of TPM. Then we had a reorg and the QE team moved under the Head of Infra. Seen through Conway’s law, this makes sense: the Infrastructure team is more of an enabling team, and that is similar to what QEs do. They enable developers to produce better-quality products!
Have you ever wondered why teams go through rounds of reorgs? Reorgs often happen to improve how teams interact with each other, and the book describes 3 main interaction modes:
* Collaboration
* X-as-a-Service
* Facilitating
Depending on the company, QEs are either embedded within feature teams or centralized.
Such choices are not driven by company culture alone; they also follow from the 4 fundamental team topologies:
* Stream-aligned team: A team aligned with the flow of business. E.g. we have rider and driver onboarding teams, driver insurance & safety, etc. Our QEs are embedded within these teams to support that flow.
* Platform team: A team that works on the underlying platform supporting the stream-aligned teams. E.g. a Data Science team that provides data to the driver insurance & safety team so it can analyze whether a specific driver is considered “safe”.
* Enabling team: A team that assists other teams in adopting and modifying software as part of a transition or onboarding. E.g. Infrastructure teams usually provide the underlying framework that their internal customers, aka developers, build on.
* Complicated-subsystem team: A team with a special remit for a subsystem that is too complicated to be handled by a normal stream-aligned or platform team. E.g. we have a CI team of 3 people, which is a subteam under the Infra org.
For effective delivery, always start with the team. A few parameters to consider:
* Team size
* Team lifespan
* Relationships within teams
* Team cognition
On team size: organizational groupings typically follow Dunbar’s number, beginning with 5-8 people and then increasing to 15, then 50, then 150, 500 and so on.
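To make those grouping sizes concrete, here is a minimal Python sketch that finds the smallest grouping a given headcount fits into. The limits come from the sizes quoted above (taking 8 as the upper end of the 5-8 range); the function name `smallest_grouping` and the choice to return `None` above 500 are my own assumptions for illustration, not something the book prescribes.

```python
from bisect import bisect_left
from typing import Optional

# Upper bounds of each grouping size quoted above (assumption: 8 is the
# top of the "5-8 people" range for a single team).
DUNBAR_LIMITS = [8, 15, 50, 150, 500]


def smallest_grouping(headcount: int) -> Optional[int]:
    """Return the smallest grouping limit that can hold `headcount` people.

    Returns None when the headcount exceeds the largest listed limit.
    """
    if headcount < 1:
        raise ValueError("headcount must be at least 1")
    # bisect_left finds the first limit that is >= headcount.
    index = bisect_left(DUNBAR_LIMITS, headcount)
    if index == len(DUNBAR_LIMITS):
        return None
    return DUNBAR_LIMITS[index]


if __name__ == "__main__":
    for people in (3, 9, 120, 501):
        print(people, "->", smallest_grouping(people))
```

A team of 9, for instance, no longer fits the single-team size and lands in the 15-person grouping, which is a hint that it may be time to split.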
Cognitive load is much overlooked. As a junior engineer, I always found it hard to say "no" to a new assignment, until one of my mentors advised me that I couldn't grow into a senior engineer unless I learned to. That paved the way for me to become a more data-driven and results-oriented leader. The book outlines the following three types of cognitive load:
* Intrinsic Cognitive Load: This relates to the skill of how to do a task. E.g. being able to test a given application.
* Extraneous Cognitive Load: This relates to broader environmental knowledge that is not part of the specific skill required for the task but is still necessary. E.g. being able to run your automated tests in CI and debug issues along the way.
* Germane Cognitive Load: This relates to big-picture thinking, i.e. how to make your work as effective as possible. E.g. where does your team fit in, how does your team improve the overall business, and how can you bring more value?
Another interesting concept I came across was the Spotify model for organizational design. Spotify organized its teams for a rapid flow of change using semi-autonomous Squads, Chapters of similar-skilled people, Tribes of related Squads working on similar product areas, and Guilds that act as cross-tribe communities. This model emphasizes a rapid flow of change because each Squad has a mix of skills (engineers, testers, business analysts, ops people, etc.) that allows it to own a slice of the Spotify product and take an idea from concept through coding to production. The software architecture also benefits because different business domain concepts are neatly separated into bounded contexts mapped to different Squads, which helps with a fast flow of change.
The book outlines how other organizations have tried to emulate the Spotify model but often end up with mixed results and limited success. Why would that be?
Limitations of the Spotify model
Although the Spotify model is a useful starting point, it does not directly address several key aspects of modern software development, including:
* The size of software owned by (or assigned to) each team and the cognitive load on the team
* How the organization and software will be affected by Conway’s Law
* Patterns and models for team interactions
* Triggers to tell us when to change and evolve the organization structure
Team topologies are not static and can change with the situation. The factors discussed above all come into play when deciding which topologies and interaction modes fit the needs of the organization at a given point in time! However, following team topologies alone will not produce success. Companies also need to maintain a healthy organizational culture, good engineering practices, healthy financial practices and clarity of business vision.
One last takeaway, based on what Tristan mentioned in our meeting: agree with your team on a Slack status (3 tiers modeled after pepper spice levels 🌶) that reflects your cognitive load for the day, and use it as a non-verbal cue so your team understands how they can support you better. Simple, yet powerful!
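If your team wants to agree on the tiers up front, a small Python sketch like the one below could serve as the shared definition. Everything in it is an assumption for illustration: the tier names, the 1-10 self-reported score, the cut-off points and the status wording are hypothetical, and the code only builds the status text rather than calling Slack.

```python
from enum import Enum


class LoadTier(Enum):
    """Three hypothetical tiers, one pepper per level of cognitive load."""
    MILD = "🌶"
    MEDIUM = "🌶🌶"
    HOT = "🌶🌶🌶"


# Hypothetical wording the team would agree on for each tier.
STATUS_TEXT = {
    LoadTier.MILD: "Plenty of headroom, happy to help",
    LoadTier.MEDIUM: "Busy, async questions preferred",
    LoadTier.HOT: "At capacity, please hold non-urgent requests",
}


def tier_for(load_score: int) -> LoadTier:
    """Map a self-reported load score (1-10) to a tier.

    The cut-offs (<=3 mild, <=7 medium, else hot) are assumptions.
    """
    if not 1 <= load_score <= 10:
        raise ValueError("load_score must be between 1 and 10")
    if load_score <= 3:
        return LoadTier.MILD
    if load_score <= 7:
        return LoadTier.MEDIUM
    return LoadTier.HOT


def slack_status(load_score: int) -> str:
    """Build the status line to paste into Slack for today's load."""
    tier = tier_for(load_score)
    return f"{tier.value} {STATUS_TEXT[tier]}"


if __name__ == "__main__":
    print(slack_status(8))
```

Writing the tiers down once, in whatever form, keeps everyone reading the peppers the same way.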