On Problem Solving Methods

November 23, 2023

architecture complexity management

I’ve always enjoyed solving problems. The idea that the solution isn’t always evident, and that there are many ways to get “to victory”, has always appealed to me. For most of my career, I’ve been on the technical end of problem solving. A client wants to build an application, and they need application infrastructure to support their journey. There are always multiple actors in such systems, and there is always a balance that you need to achieve between the overall goals of the system and the individual goals of the actors in the system.

As I reflect, I feel like the problem solving I was doing in the earlier days of my career was easier than it is now. Why is that? Was this because I was better before (and have somehow degraded over time)? Were the problems just simpler back then (has technology evolved, and the options evolved to make everything harder)? After spending a week at a retreat on complex adaptive systems, I think I have finally found some better answers.

Granularity / Scope

I think the first reason problem solving was simpler back then is the granularity of the part of the system I was interacting with. Basically, my job function dealt with a small portion of the overall system, and because of this, the problem space in which I had to find a solution was limited.

Let me try to expand on this a bit. From a problem solving perspective, you generally have to scope the part of the system that you are working on. When I say system here, I mean the people/process/technology that goes into taking a set of inputs and producing a set of outputs. Getting this scope right is super important to discovering the range of solutions to select from. Sometimes, the scope you can select is limited by your agency to change or affect the broader system. For example, as a software developer, you may only have the agency to change a small portion of code, and you have to consider the rest of the system (code, processes, people, etc.) as fixed constraints. Another way to look at this is by walking up the stack of architecture (code, solution, enterprise, business, etc.). The broader the scope of the architect, the broader the system under test, and (hopefully) the larger the agency to change the components of the system.

Other times, the scope of the system can be artificially constrained in other ways. I like to think of this as the ways in which the system can actually change. One easy example here is that the amount of change the system can bear is tied to the amount of money it would cost to make that change. So, the realm of possible solution outcomes is constrained by money. Practically, most of these artificial boundaries are actually permeable, more like a guardrail on a highway than a mountain wall. If you had enough reason to do so, you could likely break through the guardrails.

I feel like in my current line of work, getting the granularity right is very tricky. This isn’t just because you need to understand more about the system under test (the people/process are also included in the system); you also need to understand which of the boundaries/constraints of the system are actually permeable, and which ones are not. There are more knobs to play with now, and therefore a broader range of possible outcomes.

Complex Adaptive Systems

Much of management theory these days likes to view systems in terms of their ability to take a set of inputs and reliably create a set of outputs. This mechanistic view of the world lets you reduce systems from wholes to the sum of their parts, make changes (or optimizations) to those parts, and then recombine the parts to re-form the whole. This view is, maybe, fine for technology-only problems, but it breaks down when you include people/process in the equation. Why? Because human systems are a subset of the systems we would term complex adaptive, and these systems have materially different properties that need to be considered.

It would be impossible for this blog post to cover all the theory you would need to understand this concept properly, so here are some bullets of things to keep in mind when dealing with complex adaptive systems.

The relationship between inputs and outputs is not fixed in the same way as in mechanical systems
These systems are not linear, as properties of the whole cannot necessarily be broken down into properties of its component parts
These systems evolve together with their component parts (or, rather, are entrained together)

What does this ultimately mean? Complex adaptive systems are quite different from traditional systems and you can’t use patterns and practices from traditional systems theory in this space. The rule of thumb here? It is likely that your system under test is always complex adaptive (as we build and design for and with humans). The key is to understand your boundaries and the granularity of the system you are working with.

Problem Solving Methods

So, what does this all mean? Well, put simply, depending on the system under test, you will need to employ different problem solving methods to actually achieve what you want. Drawing from theory, there are four primary problem solving methods.

Sense-Categorize-Respond

This problem solving method generally works when the problems you are trying to solve are simple, and have known solutions. Remember, problems can be simple either because they actually are, or because you have constrained the system (by setting the boundaries) such that they are. These “clear” problems require you to sense what the scope of the clear problem is, categorize it based on the system of best practices you can apply, and then go out and apply that best practice.

An example here would be when a client wants to implement a capability, but is happy with a commercial off-the-shelf product in default (or close to default) configuration. At this point, you are going to gather requirements, pick the product that is the best fit, and implement.

Sense-Analyze-Respond

This problem solving method generally works when the problems you are trying to solve are complicated in nature. Complicated is a technical term here: while there is predictability between sets of inputs and outputs, the system has many moving components that need to work together to accomplish the goal. These problems require you to sense the current state, analyze the possible set of good practices that could be applied, and then respond based on your analysis.

An example here would be implementing landing zones for a given public cloud provider. There are many components and considerations to bring into scope, and some of those are human processes. For example, let’s say you want to take a vending-machine approach to cloud adoption, enabling citizen integrators to use the cloud with guardrails. This approach is quite different than one supported by a platform engineering team responsible for service deployment and design, and requires deploying a different set of services to accomplish.

Probe-Sense-Respond

This problem solving method generally works when you are trying to “find” the solution that works best in the context of your organization. The key point to remember is that the solution isn’t known (or isn’t knowable) ahead of time, and you need to see/experience the solution before you know it is the right one for you. Further, right is actually defined in terms of an acceptable set of trade-offs and also in terms of the emergent properties of the system, not in terms of linear outcomes. In this space, you generally probe (or monitor) first, sense what to do, then respond. You rinse/repeat this process until the outcomes fit within acceptable parameters for your organization.

To build upon the example in the last problem solving method, what if your organization isn’t sure whether a vending machine model will work for them, or isn’t sure where the vending machine model would work or what support teams would need to use the model effectively? You may want to run multiple safe-to-fail experiments to understand how the organization will respond when this new capability is introduced. After the experiments are run, you will sense the changes to the organization, and then respond accordingly.
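The rinse/repeat loop described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the function name, the "hours of onboarding support" setting, the adoption score, and the threshold of 60 are all hypothetical, and a real organization will not respond as predictably as the stand-in `run_pilot` function does.

```python
from typing import Callable, Optional, Tuple


def probe_sense_respond(
    probe: Callable[[int], int],
    acceptable: Callable[[int], bool],
    adjust: Callable[[int, int], int],
    setting: int,
    max_rounds: int = 5,
) -> Optional[Tuple[int, int]]:
    """Repeat small experiments until the outcome is acceptable.

    Returns (setting, rounds_used), or None if no acceptable outcome
    was found within max_rounds.
    """
    for round_number in range(1, max_rounds + 1):
        observed = probe(setting)             # probe: run a safe-to-fail experiment
        if acceptable(observed):              # sense: is the outcome within tolerance?
            return setting, round_number
        setting = adjust(setting, observed)   # respond: change something, then retry
    return None


# Hypothetical stand-in for a pilot: percent of pilot teams that adopt the
# vending machine model, given hours of onboarding support per team.
def run_pilot(support_hours: int) -> int:
    return min(100, 30 + 10 * support_hours)


result = probe_sense_respond(
    probe=run_pilot,
    acceptable=lambda adoption: adoption >= 60,        # assumed tolerance
    adjust=lambda hours, _observed: hours + 1,         # add an hour of support
    setting=0,
)
print(result)
```

The point of the sketch is the shape of the loop, not the numbers: you do not pick the "right" level of support up front, you find it by experimenting and stopping once the outcome falls within what your organization can accept.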

Act-Sense-Respond

This problem solving method is great in times of uncertainty, or when rapid response is required. The idea is that your environment is entering a state of chaos, so you need to act (do something) and then see what the results are. If you like the results, amplify them. If you don’t like the results, dampen them.

This type of chaotic state can be useful for innovation, as you are effectively trying new/novel approaches (or at least have the ability to do so) and are just gauging what happens. This state may be required if you find yourself in the middle of a security incident without a standard incident response plan to follow, or are dealing with a new/novel/unique threat and need to respond accordingly.
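As a quick recap of the four methods, here is a minimal Python sketch that pairs each kind of problem with its sequence of steps. The domain labels, the `classify` helper, and its three yes/no questions are my own simplification for illustration; real problems rarely sort this cleanly.

```python
from enum import Enum
from typing import Dict, Tuple


class Domain(Enum):
    """Illustrative labels for the four kinds of problem described above."""
    CLEAR = "clear"
    COMPLICATED = "complicated"
    COMPLEX = "complex"
    CHAOTIC = "chaotic"


# Ordered steps of the problem solving method paired with each domain.
METHODS: Dict[Domain, Tuple[str, str, str]] = {
    Domain.CLEAR: ("sense", "categorize", "respond"),
    Domain.COMPLICATED: ("sense", "analyze", "respond"),
    Domain.COMPLEX: ("probe", "sense", "respond"),
    Domain.CHAOTIC: ("act", "sense", "respond"),
}


def classify(
    rapid_response_required: bool,
    solution_knowable_upfront: bool,
    best_practice_exists: bool,
) -> Domain:
    """Hypothetical triage: three rough questions, checked in order."""
    if rapid_response_required:
        return Domain.CHAOTIC        # act first, then see what happens
    if not solution_knowable_upfront:
        return Domain.COMPLEX        # you have to experiment to find the answer
    if best_practice_exists:
        return Domain.CLEAR          # apply the known best practice
    return Domain.COMPLICATED        # predictable, but needs analysis


# Example: a landing zone design is knowable up front but has no single best practice.
domain = classify(
    rapid_response_required=False,
    solution_knowable_upfront=True,
    best_practice_exists=False,
)
print(domain.value, "->", "-".join(METHODS[domain]))
```

The order of the questions is a design choice: urgency is checked first because a situation that needs a rapid response leaves no time for analysis or experiments, whatever the underlying problem looks like.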

Conclusion

A couple of closing thoughts here. Firstly, this isn’t one-size-fits-all. Based on the scope of your project, you may decide to break the project down further and use different problem solving methods for different aspects of the project, rolling the results up. Secondly, we rarely deal with simple problems (at least in my space). Thirdly, once you add people/process to the scope of your problem, you need to be thoughtful about your problem solving approach. Sometimes the solution is knowable, but in most cases, you won’t know what is acceptable (across the entirety of the system) until you see it. This is because complex adaptive systems are non-linear in nature, and the properties you care about may actually be emergent properties of the system that are not directly related to any one component within it.

Hope that helps!