Can you Sprint when there are a lot of unknowns?

Over the past few months I have spoken with people from a number of teams who have found themselves facing a large amount of uncertainty in their mission: their precise roadmap has been unclear, and the technical details of how to achieve it foggier still.

This is a common scenario for teams, particularly in start-ups where there is a significant rate of development (or growth) and a large amount of risk and uncertainty in the roadmap ahead.

Often, I hear these teams say that they should use Kanban because planning is difficult with such uncertainty. I can understand the logic and thinking that leads people to this conclusion, but I'd like to challenge it:

One of the rules of Scrum is that in each Sprint the team produces a potentially releasable increment [of product]. To achieve the Sprint Goal teams typically implement Product Backlog Items, but there is no requirement that PBIs be the only work that a team can complete, nor even that they are completed at all in achieving a Sprint Goal. It would be unusual, but conceptually a Sprint Goal could be achieved without a single PBI being done.

If we contrast this to a Kanban or continuous flow approach where the focus shifts towards progressing individual cards rather than the greater goal, we can start to see that Scrum is in fact ideally suited for environments where there are a lot of unknowns, and a lot of uncertainty.

If Kanban teams try to optimise their flow, as you would expect a team practising continuous flow to do (since WIP, cycle times and lead times would be key metrics), they will soon struggle as a consequence of the uncertainty and unknowns; it is difficult to achieve 'flow' when it's not clear what work needs to be done, or indeed how work can be done.

This leads to an interesting conclusion that would be echoed by many leading Scrum Masters worldwide: Scrum is ideally suited and intended for environments that are complex, and therefore have many unknowns. This is in stark contrast to the common view that scope is fixed within a Sprint.

It is this latter view that I believe leads many to defer to Kanban and continuous flow when there are many unknowns. The irony is that this leads teams to consistently choose a way of working that isn't suited to the problem space that they are in.

There is, however, a further level of uncertainty where not even a Sprint Goal can allow for the flexibility in scope required. This is particularly rife where there is no concept of a product yet, and so there's a lot of experimentation and trial-and-error in trying to build something that gains traction in a market.

Product Owners, engineering decisions, and 'technical debt'

When you’ve got a software product, it is composed solely of items like the code, static assets, supporting architecture and infrastructure (hereafter referred to as 'the tech'). These items are often thought of as being a necessary part of the product, but really they are the product and trying to separate them is a logical fallacy similar to trying to separate the food from a meal.

I think it's worth pausing and thinking about that for a second because it is an important concept to understand and embrace: in a software product the "tech" is the "product", and vice versa.

Things that are truly good for the tech are ultimately good for the product, and for the product to succeed the tech must itself succeed.

So why, then, do teams continue to treat the two as entities that are somehow distinct? Even the notion of things like 'tech debt' shouldn't exist. Let's explore that…

Of course code, architecture, infrastructure, etc. matter, but not for their own sake; none of these things can stand alone. If things are unstable, unscalable, or difficult/slow to update, that directly impacts the product value that can be delivered. 'Technical debt', therefore, is more accurately labelled as 'product debt', or even just 'debt'. Technical issues that have no impact on the product (or the future development of the product) are completely irrelevant. There are surely no exceptions to this.

If a codebase requires refactoring because making changes is error-prone and therefore slow, this has a direct influence on the product value that can be delivered. If issues such as this go unaddressed, the rate at which product value can be delivered will decrease (probably exponentially) over time. If product value cannot be delivered, that's a product concern first and foremost.

It’s probably not for the Product Owner to dictate specific implementation details, but they do have a very significant stake and interest in how their product is built, since the decisions made can have significant influence over its success and the possible speed of iteration.

A common over-simplification is that the Product Owner has responsibility over “what” and the Development Team have responsibility over “how”, and while this may be true at a very high level, The Scrum Guide reminds us that "The Product Owner is responsible for maximising the value of the product and the work of the Development Team". If architectural and 'debt' style tradeoffs will influence the delivery of value, the Product Owner should be part of such decisions. They may or may not have the technical knowledge (or the interest) to know whether a Product should be written in Python or C#, but they may care about the advantages of each. They may also care whether a new feature is implemented in a generic way, or hard-coded, especially if there is a big difference in implementation time.

I think that many of these issues have roots in legacy ways of working where there was a more distinct divide between Product Management and Engineering disciplines, and where product changes were captured in lengthy and detailed specification documents. In the agile world, where we’ve come to value things like working software, and individuals and interactions, we can do a lot better than handoffs of specifications. We can communicate, collaborate and negotiate with honesty about different solutions and the pros and cons that they bring because we’re now one team with one goal.

We must never lose sight of the importance of good technical practices, but only in so far as they support the product. "Being agile" is never an excuse for cutting corners, but it does challenge our long-held beliefs about how teams can work together to satisfy our users and customers through early and continuous delivery of valuable software.

Scaling Scrum in a very simple way

There are a lot of scaled agile frameworks out there. Some are big and complicated, whilst others are small and simple. Each one has pros and cons depending on a wide variety of factors, not least the maturity of the company and the knowledge/experience/skill of the people implementing it.

One framework in particular stands out to me as being universally applicable and useful across a wide variety of teams — Nexus. Anybody who is interested in scaling an agile process would benefit from reading The Nexus Guide, even if it’s just for some ideas and inspiration.

Nexus is the foundation of the diagram below, and at its heart is a balance of simplicity and process definition. Quite intentionally, the diagram only shows the key flows of information & product throughout the process, because while the concept is highly universal and applicable across many teams, specific details seldom are.

I would generally expect the practices that have worked well for a Scrum team to work well at scale in this model. Of course, as Nexus explains, some parts — such as backlog refinement — may benefit from being an event in their own right.

[Diagram: Simple Scaled Scrum]

The diagram shows some key concepts:

One product should have just one Product Backlog (with just one Product Owner). Multiple teams can work off this single backlog, and the increment should be reviewed collectively across all teams, in just one Sprint Review. Why? Because there is one product and one Product Backlog, reviewing each team’s work in isolation wouldn’t be conducive to effective/appropriate inspection and adaptation. Transparency would likely suffer and the process would inevitably become bloated as more meetings and handoffs attempt to compensate for shortcomings.

The individual teams, and the collective, both need their own retrospectives (which ideally should be held every Sprint). Individual teams need retrospectives for the same reasons that they do in any other non-scaled Scrum team, and because they are so dependent on and integral to the wider goal and associated processes, they benefit from having a formal opportunity to inspect and adapt these aspects as well. The groups should do this collectively, and most preferably without any exceptions; limiting this exercise to team leads or other representatives is in most cases a false economy. If people aren’t able to be involved, even if it is just to listen to what’s going on, the group will struggle to self-organise and work effectively.

Low cost of entry to the Product Backlog. The Product Owner is responsible for the backlog, and so they may choose to apply a process around inbound items, to ensure that they are understood and of an acceptable standard (have the necessary attributes, etc.), but crucially this process needs to be simple. The refinement loop can be used to better prioritise items, and apply detail as necessary, etc. but the emphasis on this entire process is simplicity. If it isn't simple, it won't work. Did I mention simplicity is key?

It’s a particularly common pattern for scaled processes to introduce unnecessary complication and even complexity around this process. This introduces waste, increases cycle time and greatly reduces transparency around the process as a whole. I’ve seen processes where there are six or seven distinct steps/hand-offs to get a simple PBI (like a bug, or an action to resolve technical debt) onto a Product Backlog and prioritised. That's six or seven steps before a Scrum team has started their work on it; that's a lot of upfront investment which has the potential to produce a lot of waste.

Communities of interest allow knowledge sharing and appropriate long-term strategising. When you’ve got a lot of people working towards a common goal on a project, there should be an appropriate platform for people to get together, communicate and share their thoughts. Examples might include discussion about a new JavaScript framework that might be valuable to the team/product, or changes to the unit testing strategy in an effort to improve code quality or reduce technical debt. They may also provide a platform for things like architectural discussions and strategic actions from both a product and technical perspective.

It is common in larger projects for more than one person to have an interest or stake in the Product. This is particularly true in larger companies. History has taught us that having multiple people responsible for the Product Backlog is a recipe for problems, so providing a recognised platform for people to discuss things outside of the Sprint Review could be useful to fulfil this desire/need. It is important to remember the purpose of the Sprint Review, and that groups such as this cannot and should not take its place.

A key attribute of ‘communities of interest’ is that they are both transparent and open; anybody with an interest should be welcome to be part of them. They should meet as often as necessary, which is probably at least once every couple of Sprints. They probably shouldn’t have any distinct or tangible output in their own right; they shouldn’t be part of a backlog refinement process for example, but their members may choose to act on the discussions and decisions at the appropriate events, such as in backlog refinement or at the Sprint Review.