Surface testing

In this article, I present surface testing, a style of software testing. This style has worked extremely well for me and it might be useful to you as well.

Any piece of software has certain parts that are exposed to its users. These exposed parts of the software are sometimes called “interfaces”, but since the term “interface” is usually conflated with graphical user interfaces or with OOP-style interfaces, I’d rather use the term “surfaces”. The goal of surface testing is to thoroughly test a piece of software only through the surfaces that it exposes.

This is best understood through an example. Let’s say you’re implementing an HTTP API and you want to write tests for it. If you decide to use surface testing, you will test your API entirely through HTTP calls. For example, if you implement two endpoints, one to GET a widget and one to POST a widget, your surface tests will first make a call to create a widget through POST, then another to get the newly created widget through GET.

Surface testing stands in stark contrast to other types of testing.

  1. Unit testing: this style of testing finds the smallest possible subcomponents, whether they’re exposed or not, and tests them in isolation. Often this requires mocking the dependencies of each component. Surface testing does not care about the size of the components being tested, nor does it care about testing them in isolation. And surface testing seldom requires mocking of dependencies.
  2. E2E (end-to-end) testing: this style of testing aims to test the entire system, including the graphical user interface, as a whole. Even if the app exposes an API, E2E tests will be executed against the user interface. Surface testing, in contrast, dictates that both the graphical user interface and the API should be tested separately, since both are exposed surfaces.
  3. Integration testing: this style of testing aims to only test the interactions between different systems, making sure they adhere to their respective API contracts. Surface testing doesn’t directly test the interactions; rather, it tests them indirectly through the exposed surfaces of the respective systems. If a surface X in a system A depends on a call to a system B, by testing X and obtaining satisfactory results, the interaction between systems A and B will have been tested too.

Surface testing suggests you take the venerable testing pyramid and chuck it in the bin. The testing pyramid suggests that most of your tests should be unit tests, a few should be integration tests, and precious few should be E2E tests. What this advice will yield is the following:

My contention is, in short, that if you follow the testing pyramid, you’ll end up with a test suite that is:

Surface testing is much more straightforward:

  1. List: make a list of the parts of the system that you expose, such as user interfaces, API endpoints, or main functions (if you’re writing a library). These are your surfaces.
  2. Run: have a version of your system ready that is running and connected to all it needs to function (DBs, external services, etc.). It doesn’t matter whether it’s running locally or remotely; all that matters is that it should be the real thing. If you’re testing a library, you only need to be able to run it, just as you would if you were using it.
  3. Test: test each of the surfaces of your system in the same way as a highly caffeinated human tester would. For each surface, write tests that send invalid data and make sure that the surface returns proper errors instead of proceeding or crashing. Then, move on to the correct cases, making sure that the surface adequately processes the requests.
  4. Assert: place strict assertions on the results you obtain from each test. It’s not enough (for example) to check whether a read operation gave you a 200 code. You need to check that the actual returned body is exactly what you expect, to the maximum level of detail possible.
  5. Chain: chain the tests in a logical sequence. If you’re building a CRUD, you can start by testing creation, then reading, then updates, then deletions. Usually, to test an update or a deletion, you’ll perform another read operation to check that the update or the deletion has indeed been successful. Not only is this OK, it’s the correct way to do it.
  6. On first error, stop. Do not continue running tests if a single case fails. Focus on eliminating that error (by either fixing the code or fixing the test) so that it doesn’t happen again. There are no partial successes. The test suite either fully passes or fully fails. This is auto-activation in action.
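The steps above can be sketched for the library case as follows. Everything here is hypothetical and invented for illustration: `WidgetStore` plays the role of the library whose public methods are the surfaces, and `check`, `expect_error`, `build_suite` and `run_suite` are just one possible way to chain tests, assert strictly, and stop on the first error.

```python
class WidgetStore:
    """Hypothetical library under test; its public methods are its surfaces."""

    def __init__(self):
        self._widgets = {}
        self._next_id = 1

    def create(self, name):
        if not isinstance(name, str) or not name:
            raise ValueError("name must be a non-empty string")
        widget = {"id": self._next_id, "name": name}
        self._widgets[widget["id"]] = widget
        self._next_id += 1
        return dict(widget)

    def read(self, widget_id):
        if widget_id not in self._widgets:
            raise KeyError(widget_id)
        return dict(self._widgets[widget_id])

    def update(self, widget_id, name):
        if not isinstance(name, str) or not name:
            raise ValueError("name must be a non-empty string")
        self.read(widget_id)  # raises KeyError for unknown ids
        self._widgets[widget_id]["name"] = name
        return self.read(widget_id)

    def delete(self, widget_id):
        self.read(widget_id)  # raises KeyError for unknown ids
        del self._widgets[widget_id]


def check(actual, expected):
    """Strict assertion: the whole result must match, not just part of it."""
    assert actual == expected, f"expected {expected!r}, got {actual!r}"


def expect_error(error_type, action):
    """Pass only if the action is rejected with the given error type."""
    try:
        action()
    except error_type:
        return
    raise AssertionError(f"expected {error_type.__name__}")


def build_suite(store):
    """Ordered steps: invalid input first, then create, read, update, delete."""
    return [
        ("create rejects empty name",
         lambda: expect_error(ValueError, lambda: store.create(""))),
        ("read of unknown id fails",
         lambda: expect_error(KeyError, lambda: store.read(1))),
        ("create", lambda: check(store.create("sprocket"), {"id": 1, "name": "sprocket"})),
        ("read after create", lambda: check(store.read(1), {"id": 1, "name": "sprocket"})),
        ("update", lambda: check(store.update(1, "cog"), {"id": 1, "name": "cog"})),
        ("read after update", lambda: check(store.read(1), {"id": 1, "name": "cog"})),
        ("delete", lambda: check(store.delete(1), None)),
        ("read after delete fails",
         lambda: expect_error(KeyError, lambda: store.read(1))),
    ]


def run_suite(steps):
    """Run the steps in order and stop at the first failure."""
    for position, (name, step) in enumerate(steps, start=1):
        try:
            step()
        except Exception as error:
            return f"FAILED at step {position} ({name}): {error!r}"
    return f"all {len(steps)} steps passed"
```

Calling `run_suite(build_suite(WidgetStore()))` runs the whole chain. Because later steps depend on the state left behind by earlier ones, continuing after a failure would mostly produce noise, which is one reason to stop at the first error.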

When it comes to assertions in your tests, you could think of them as validations of your system. In the same way that your system should validate the inputs provided to it by users, so your tests must validate the outputs produced by your system. Validations (in the system) and assertions (in the tests) are the two sides of the same coin.

What are the downsides of surface testing?

All three downsides are, to me, virtues: they make testing harder, in the same way as other hard but valuable things, like working out or learning a new language. This effort will not be in vain: it will improve your system and your understanding of it. This is again in stark contrast with the effort expended in writing unit tests, with their accompanying mocks: those efforts usually improve only an isolated part of the system and do little to improve your understanding of the system as a whole.

Surface testing has another advantage over other types of testing: if you want to rewrite your system but maintain your existing contracts, you can simply take the tests from the old system and apply them to the new one. Surface tests are as portable as your contracts.
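A small sketch of that portability, under invented assumptions: `slugify_old` and `slugify_new` are two hypothetical implementations of the same contract, and `surface_tests` receives the implementation as a parameter, so the same tests run unchanged against both.

```python
import re


def slugify_old(title):
    """Original implementation (hypothetical): character-by-character loop."""
    out = []
    for ch in title.lower():
        if ch.isalnum():
            out.append(ch)
        elif out and out[-1] != "-":
            out.append("-")
    return "".join(out).strip("-")


def slugify_new(title):
    """Rewrite (hypothetical): same contract, implemented with a regex."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def surface_tests(slugify):
    """Tests written against the contract only, not against an implementation."""
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaced   out  ") == "spaced-out"
    assert slugify("already-a-slug") == "already-a-slug"
    assert slugify("") == ""
    return True
```

Running `surface_tests(slugify_old)` and then `surface_tests(slugify_new)` exercises both versions with identical assertions. Had the tests reached into the loop or the regex, they would have had to be rewritten along with the code.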

Three important points that I haven’t covered yet:

One last point that applies to any type of testing: if you find a bug in your system, you didn’t find one bug: you found two. The first one is the bug itself; the second one is the absence of a test that would have caught that bug. When you fix the bug, make sure you also add a test that checks for the specific condition that triggered the error.

If you made it this far, I’ll gladly receive your suggestions & objections. Thanks for reading!