
Michael Bolton (@michaelbolton), 29 tweets, 6 min read


1) @jamesmarcusbach and I have recently been working on our notion of deep testing. In the Rapid Software Testing namespace, we define “deep testing” as: *testing that maximizes the probability of finding every elusive bug that matters*.

2) By contrast, shallow testing is not intended to maximize the probability of finding every elusive bug that matters. But that doesn’t mean that “deep” is necessarily good, nor that “shallow” is an insult; far from it. For one thing, we *always* start with shallow testing.

3) Deep testing must be bootstrapped by shallow testing. One crucial outcome of shallow testing is that we get a sense of where it might be important to go deep on the one hand, or unnecessary—or unnecessarily expensive—to go deep on the other. Shallow testing might suffice.

4) Except by luck, finding elusive bugs tends to be expensive in one dimension or another. A bug might be elusive because it depends on a very limited set of conditions, or a set of conditions that line up infrequently or intermittently. A bug might be elusive through emergence.

5) An emergent bug is one that doesn’t exist in any single part of a system; it results from system dynamics, interactions between parts that are fine on their own. Such bugs tend to be subtle, not apparent by looking at units or sub-systems in isolation.

6) A bug might be elusive because our models of *potential problems in the product* often lag behind our models of *the product*. That is, a bug may become easier to anticipate or to observe as we gain experience with the product; as we develop ideas about coverage and oracles.

7) Although we can prepare for them, we can’t very well schedule invention, insight, or discovery. Deep testing, therefore, tends to take time. It takes determination. It tends to take some degree of experience with the product, its domain, technology, and testing itself.

8) Deep testing demands *requisite variety* (cf. Ross Ashby) to trigger conditions that help to reveal elusive bugs. That includes variety in testing activities; quality criteria; models of coverage (data, timing/sequencing, user actions, platforms/environments...); oracles...

9) Managers can foster deep testing by providing time, resources, and other aspects of project- and value-related testability. Designers and developers can help by providing intrinsic testability. Testers can develop subjective testability. (cf. satisfice.com/download/heuri…)

10) Now: deep testing takes time, determination, and resources. When might you NOT want to do it? One: you might not want to do it YET, or ALL THE TIME. You might want to do shallow testing now to learn about the product and attendant risks; to learn how and where to go deep.

11) You might prefer shallow testing now to maintain appropriate discipline and control in building the product, while avoiding the effort, expense, and interruptions associated with deep testing; then do deep testing when you’ve built something that’s ready for it.

12) In other words, you probably don’t want to do deep testing before you and the system you’re building are ready for it. You also probably don’t want to do deep testing when you reasonably believe there are no elusive bugs, or that any elusive bugs that exist won’t matter.

13) Let’s return to the business of “shallow” being a potential insult. We insist that it isn’t an insult, though it is an assessment of the thoroughness of testing. Assessment of depth informs an assessment of the quality of the testing relative to assumptions about the product.

14) If your assumptions about the state of the product are *reckless*, then shallow testing will help to reveal that quickly and inexpensively, even powerfully. Yay shallow testing! Deep testing isn’t needed in that case; it would be expensive and probably wasteful.

15) Assumptions can be *risky*—that is, there’s a possibility that they’re incorrect. That’s where testing likely needs to be appropriately deep—to help manage risk and uncover instances where assumptions turn out to be wrong.

16) Assumptions can be *safe*—that is, they’re likely to be correct, and unlikely to be incorrect. One of the things that makes skilled testers different is this: we question assumptions that other people consider to be safe. We anticipate and hunt problems with safe assumptions.

17) There’s a natural tension and a certain amount of social awkwardness in this: testers often advocate for deeper testing when others believe that shallow testing suffices. Two approaches will reveal problems with safe assumptions: deep testing now; problems in the field later.

18) There’s another kind of assumption; we’re arguing over the label for it. James likes “required”, in the sense of *socially* required; assumptions so safe that if you try to manage or even mention them, your social group will think you to be crazy, or rude, or at best joking.

19) In this case, the deeper your testing, the more others will believe you to be obsessive-compulsive, or naïve; poorly calibrated towards important risk. For super-safe assumptions, even shallow testing may be too much—like testing Windows or Chrome before your new product.

20) It’s part of our mission as testers to perform deep testing where it’s necessary, and avoid it where it isn’t. It’s also part of our mission to advocate for shallow testing—not deep—where it’s sufficient, or where it helps us in going deep.

21) Deep testing can be way easier with skilled use of tools, and way harder without it. The key word here is *skilled*, which includes expertise in the extents and limits of tools, and in testing. Tools can afford lots of that variation I referred to earlier, when you want it.

22) Got a suite of automated checks that are designed to support quick, disciplined development and building? They often provide shallow data coverage—entirely appropriate to the task in that context. You can tweak and (re)use them in deep testing by perturbing and varying data.
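One way to picture the reuse described in tweet 22 is the Python sketch below. It is an illustration under stated assumptions, not a method from the thread: `parse_quantity` stands in for product code, `check_quantity` for an existing shallow check, and the variations in `perturb` are arbitrary examples of varied data.

```python
import random

def parse_quantity(text):
    """Hypothetical product code: parse an order quantity from user input."""
    value = int(text.strip())
    if value < 1:
        raise ValueError("quantity must be positive")
    return value

def check_quantity(text, expected):
    """The existing shallow check: one input, one expected output."""
    return parse_quantity(text) == expected

def perturb(text, rng):
    """Return variations of a known-good input (hypothetical choices)."""
    return [
        text,
        " " + text + " ",                 # surrounding whitespace
        "+" + text,                       # explicit sign
        "0" * rng.randint(1, 5) + text,   # leading zeros
        text + "\n",                      # trailing newline
        text + ".0",                      # decimal form of the same number
    ]

def deep_run(text, expected, seed=0):
    """Reuse the shallow check across perturbed data and collect surprises."""
    rng = random.Random(seed)
    surprises = []
    for variant in perturb(text, rng):
        try:
            ok = check_quantity(variant, expected)
        except Exception as exc:
            # A surprise is not necessarily a bug; a person has to evaluate it.
            surprises.append((variant, repr(exc)))
            continue
        if not ok:
            surprises.append((variant, "unexpected result"))
    return surprises

if __name__ == "__main__":
    for variant, note in deep_run("12", 12):
        print(repr(variant), "->", note)
```

The check itself is unchanged; only the data around it varies. Whether rejecting `12.0` matters is a judgment the tool cannot make, which is where the tester's skill comes in.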

23) Got a set of use cases that are covered by automated checks? Yay! Now, to do deep testing, periodically gain interactive experience with the product, performing variations within those use cases. In particular, question assumptions that inform the use case. Consider MISuse.

24) Notice how use cases assume well-trained, relaxed, undisturbed users. That’s fine for shallow testing by automated checks. Now unbury, examine, and systematically overturn assumptions about how people will use the product, or about the normal states and sequences of events.
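A tool can help with the "systematically overturn" part of tweet 24, at least for sequences of events. The following Python sketch assumes a made-up `Cart` model and a three-step use case; none of these names come from the thread. It replays the use case's steps in every order and records orderings that raise an error or leave the model in a questionable state.

```python
from itertools import permutations

class Cart:
    """Hypothetical product model, used only for illustration."""

    def __init__(self):
        self.items = 0
        self.logged_in = False
        self.paid = False

    def login(self):
        self.logged_in = True

    def add_item(self):
        self.items += 1

    def pay(self):
        # The use case assumes login always comes first.
        if not self.logged_in:
            raise RuntimeError("pay before login")
        self.paid = True

# The sequence the use case takes for granted.
HAPPY_PATH = ("login", "add_item", "pay")

def explore(steps):
    """Try every ordering of the steps; record orderings worth a closer look."""
    findings = []
    for order in permutations(steps):
        cart = Cart()
        try:
            for step in order:
                getattr(cart, step)()
                # A simple oracle: nobody should pay for an empty cart.
                if cart.paid and cart.items == 0:
                    findings.append((order, "paid for an empty cart"))
                    break
        except RuntimeError as exc:
            findings.append((order, str(exc)))
    return findings

if __name__ == "__main__":
    for order, note in explore(HAPPY_PATH):
        print(" -> ".join(order), ":", note)
```

Permuting steps covers only one assumption (ordering). Interruptions, repeated steps, and distracted or hurried users would each need their own variation, and interactive experience with the product remains the way to notice problems the oracle does not name.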

25) Remember that the goal of deep testing is to maximize the probability of finding *elusive* bugs that matter. Ponder what makes a bug elusive: subtlety and rarity, each to some degree. The breadth and power of your oracles attack subtlety; your coverage confronts rarity.

26) (That’s a simplification, but not an outrageous one.) Some of the approaches for investigating intermittent bugs are useful for finding them in the first place. (cf. satisfice.com/blog/archives/…) And developing and broadening your oracles can help make your testing deeper.

27) (On oracles, cf. developsense.com/blog/2012/07/f… and developsense.com/blog/2015/09/o…). For some examples of shallow versus deep oracles, consider a check-engine light compared to a skilled mechanic’s evaluation; a smoke detector compared to an alert, observant person’s perception of danger.

28) Note that no one advocates eliminating smoke detectors or check-engine lights (“alerts”); they’re useful—so useful that having them is socially required in the face of risk. Note that the success of alerts depends on the explicit, and on already known, foreseeable problems.

29) Note that even when good instrumentation is available, awareness of subtle, hidden, surprising, rare, intermittent, emergent problems depends not (only) on explicit knowledge, but on tacit knowledge (too). To go appropriately deep, we need skilled, socially aware humans. -end
