Have you ever felt that you are making code harder to read for the sake of testing? Say you have code that has no tests yet. It has a number of side effects, and you are asked to put tests on it first. You start following advice such as passing global variables as parameters, or extracting problematic side effects so that you can replace them with stubs in tests.

But the further you go, the more uneasy you feel. Stubs for state and data keep piling up so that you can write fast and reliable unit tests. In the end you get the feeling that you have created a completely different class, one that does not accurately reproduce the real code.

You stop and think: “Is it acceptable to change code signatures for the sake of testing? Am I testing the real code, or a completely different class in which the things that matter never happen?” That is a real dilemma. Should you keep following this approach, or is it a waste of time?

The million-dollar question: for legacy code, should you write unit tests or integration tests?

The paradox

You could be in one of these situations:

[Two pictures]

I don’t know about you, but these situations feel sooooooo familiar to me. They are funny because they are true (and that is the unpleasant part).

You are told to stub out problematic dependencies and write unit tests. But at some point you get the feeling that a blind spot is growing in your tests. What guarantee do you have that the code will behave correctly with the real database in production? Won't you run into even more problems once you integrate with the real database?

I faced this dilemma too! For a long time I advocated the Testing Pyramid. It says that you should write more unit tests than integration tests, because unit tests are faster and more reliable.

I have also done a lot of front-end work. In recent years, the mantra there has been “Write tests. Not too many. Mostly integration.”, promoted by Kent C. Dodds. He is very smart and an authority on testing front-end applications. There are even tools like Cypress that let you test most of a web application with end-to-end tests. Even closer to the user!

How do you resolve this paradox? What is the right thing to do? Should you write stubs so that tests are fast and reliable? Or is it better to write integration tests that are less reliable but catch more problems?

Before we continue, let me say this: it is great that you have run into this dilemma. It feels like a problem, but it means you are moving in the right direction! This is a natural obstacle and you will overcome it. This puzzle has a solution.

A practical look at testing

I was stuck on this problem for a while and tried different approaches. The solution came after a memorable discussion with J. B. Rainsberger.

I realized that there is no paradox. There are just different definitions of a “unit.”

It seems to me that the word “unit” is confusing. People understand it differently. Typically, newcomers to testing treat a function or a method as a unit. Later they realize that it can be a whole class or module. The truth, like so much in life, is that it depends on the situation.

Isolated tests

I believe that the division into “unit” and “integration” tests is not clear enough. This categorization leads to problems and arguments, even though we all share one goal: making software easier to change by getting fast and accurate feedback when we break something.

Therefore, I prefer to talk about "isolated" tests.

I am not sure this term is better. But I like it for now, because it makes me ask an extremely important question: isolated from what?

My answer is: isolated from what is hard to test.

What's hard to test

Randomness, time, network, file system, databases, etc.

I call all of this “infrastructure”, and the rest the “domain.” Splitting software into domain and infrastructure is a very useful and practical way to look at its design. And not only for me. This separation is at the core of many maintainable architectures, such as Hexagonal Architecture (also known as Ports and Adapters), Onion Architecture, Clean Architecture, Functional Core / Imperative Shell, etc.

These are all variations on one idea: domain and infrastructure are better kept separate. Functional programming promotes the same idea: isolate side effects at the edges of your system. Side effects are infrastructure, and infrastructure is hard to test.
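To make the split concrete, here is a minimal Python sketch. The happy-hour rule and the names `is_happy_hour`, `price_for` and `current_price` are invented for illustration. The domain functions take the time as a plain value, and only a thin shell touches the system clock.

```python
from datetime import datetime


# Domain: pure business rules. No clock, no network, no database.
def is_happy_hour(now: datetime) -> bool:
    """Hypothetical rule: happy hour runs from 17:00 to 18:59."""
    return 17 <= now.hour < 19


def price_for(base_price_cents: int, now: datetime) -> int:
    """Half price during happy hour, full price otherwise."""
    if is_happy_hour(now):
        return base_price_cents // 2
    return base_price_cents


# Infrastructure: the only place that reads the real system clock.
def current_price(base_price_cents: int) -> int:
    return price_for(base_price_cents, datetime.now())


# The domain is tested with plain values; nothing has to be stubbed.
assert price_for(1000, datetime(2024, 1, 1, 17, 30)) == 500
assert price_for(1000, datetime(2024, 1, 1, 12, 0)) == 1000
```

The shell function `current_price` is so small that there is little left in it that could go wrong, which is the point of pushing the side effect to the edge.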

Yes, but how does that help?

In short, it tells you exactly what to pull out and replace with stubs. Do not put business logic into infrastructure code, or testing it will be painful.

An isolated domain is easy to test. You do not need stubs for other domain objects. You can use the same code that runs in production. You only need to get rid of the interaction with the infrastructure.

But there may be bugs in the infrastructure code!

If you keep the infrastructure code small enough, the risk goes down. More importantly, you shrink the amount of code that has to be tested against a real database.

The bottom line is this: you should be able to test every kind of behavior of your application without storing data in PostgreSQL. An in-memory database is all you need for that. You still have to check that the integration with PostgreSQL works, but a few tests are enough for that, and this is a noticeable difference!
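As a sketch of what that can look like, assume a hypothetical `UserRepository` port and a `register_user` domain function (both invented for this example). The in-memory implementation lets the test exercise the real domain logic without any database.

```python
from typing import Dict, Optional, Protocol


class UserRepository(Protocol):
    """Hypothetical port: what the domain needs from storage."""

    def save(self, email: str, name: str) -> None: ...

    def find_name(self, email: str) -> Optional[str]: ...


class InMemoryUserRepository:
    """Fake used in isolated tests: keeps users in a dict."""

    def __init__(self) -> None:
        self._users: Dict[str, str] = {}

    def save(self, email: str, name: str) -> None:
        self._users[email] = name

    def find_name(self, email: str) -> Optional[str]:
        return self._users.get(email)


class EmailAlreadyUsed(Exception):
    pass


def register_user(repo: UserRepository, email: str, name: str) -> None:
    """Domain logic: the same code runs in production and in tests."""
    normalized = email.strip().lower()
    if repo.find_name(normalized) is not None:
        raise EmailAlreadyUsed(normalized)
    repo.save(normalized, name)


# Isolated test: real domain code, in-memory infrastructure.
repo = InMemoryUserRepository()
register_user(repo, " Ada@Example.com ", "Ada")
assert repo.find_name("ada@example.com") == "Ada"
try:
    register_user(repo, "ada@example.com", "Someone else")
    raise AssertionError("expected EmailAlreadyUsed")
except EmailAlreadyUsed:
    pass
```

Rules such as email normalization and duplicate detection are covered here without a single query. Only the adapter that actually talks to PostgreSQL would need a real database in its tests.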

That was all theory. Let's get back to reality: we have code that has no tests yet.

Back to legacy code

I think this view of testing will help you. You should not overdo stubs, but with legacy code you will have to. Temporarily. That is because legacy code is a tangle of what is hard to test (infrastructure) and business logic (domain).

You need to strive to extract infrastructure code from the domain.

Then make the infrastructure depend on the domain, not the other way around. Business logic should contain as little infrastructure as possible. You will end up with fast, reliable, isolated tests that cover most of the application logic, plus a few integration tests that check how the infrastructure code works with a real database, external API, system clock, etc.

Will you still have stubs? Yes and no.

You will have alternative infrastructure implementations, so you can still execute production logic without using a real database.

How do you know that the database implementation works correctly? Write a few integration tests, just enough to check that it behaves correctly. J. B. Rainsberger calls these “Contract Tests.” I think it is best explained with this example.
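Here is a minimal sketch of a contract test in Python. Everything in it is an assumption made for illustration: the repository classes are hypothetical, and SQLite from the standard library stands in for the real production database. The idea is that one set of checks runs against every implementation.

```python
import sqlite3
from typing import Any, Callable, Dict, Optional


class InMemoryUserRepository:
    """Fake implementation used by the isolated tests."""

    def __init__(self) -> None:
        self._users: Dict[str, str] = {}

    def save(self, email: str, name: str) -> None:
        self._users[email] = name

    def find_name(self, email: str) -> Optional[str]:
        return self._users.get(email)


class SqlUserRepository:
    """Stand-in for the production adapter (SQLite is an assumption here)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT)"
        )

    def save(self, email: str, name: str) -> None:
        # Upsert, so that saving twice overwrites like the dict-based fake.
        self._db.execute(
            "INSERT OR REPLACE INTO users (email, name) VALUES (?, ?)",
            (email, name),
        )

    def find_name(self, email: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT name FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row[0] if row else None


def check_user_repository_contract(make_repo: Callable[[], Any]) -> None:
    """The contract: every implementation must pass these same checks."""
    repo = make_repo()
    assert repo.find_name("ada@example.com") is None  # unknown user
    repo.save("ada@example.com", "Ada")
    assert repo.find_name("ada@example.com") == "Ada"  # round trip
    repo.save("ada@example.com", "Ada L.")
    assert repo.find_name("ada@example.com") == "Ada L."  # overwrite


# Run the same contract against the fake and the database-backed adapter.
check_user_repository_contract(InMemoryUserRepository)
check_user_repository_contract(
    lambda: SqlUserRepository(sqlite3.connect(":memory:"))
)
```

If the fake ever drifts from the database-backed adapter (for example, if one of them stopped overwriting on a second `save`), the contract fails for that implementation and tells you the isolated tests can no longer be trusted.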

You can follow this recipe

Here's how to work with legacy code:

  1. Use the Extend & Override technique to break awkward dependencies. This is a good first step. Cut the infrastructure part out of the rest of the code so that you can start writing tests.
  2. Invert the dependency between the code and the extracted infrastructure. To do this, use Dependency Injection. This is where architectural decisions come in: can some of the extracted methods be grouped into a cohesive class? For example, group everything that relates to saving and retrieving stored data. This is your Infrastructure Adapter.
  3. If you are writing in a statically typed language, extract an interface from the class you created. This is a simple and safe refactoring. It lets you complete the dependency inversion: the code will depend on the interface, not on the specific implementation you extracted.
  4. In tests, create a fake implementation of this interface. Most likely it will store information in memory. It should implement the same interface and behave just like the production implementation you extracted.
  5. Write a contract test. It ensures that the fake and production implementations both behave as you expect. The contract test verifies the behavior promised by the interface. Because you run the same series of tests against both implementations, you know you can trust your isolated tests. You are safe! A great guide for beginners.
  6. Make the interface express business needs, not implementation details. I recommend evolving the architecture in this direction. But this comes with practice and should not hold you back.
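The first four steps can be sketched on a toy example. The dice game below is invented for illustration, with randomness as the hard-to-test infrastructure. It assumes Python, where `typing.Protocol` plays the role of the extracted interface.

```python
import random
from typing import Protocol


# Step 1, Extend & Override: the side effect sits behind one method (a seam).
class DiceGame:
    def play(self) -> str:
        return "win" if self._roll() >= 4 else "lose"

    def _roll(self) -> int:
        # Infrastructure: randomness, hard to test.
        return random.randint(1, 6)


class DiceGameWithFixedRoll(DiceGame):
    """Test-only subclass that overrides the hard-to-test method."""

    def __init__(self, roll: int) -> None:
        self._fixed_roll = roll

    def _roll(self) -> int:
        return self._fixed_roll


assert DiceGameWithFixedRoll(6).play() == "win"
assert DiceGameWithFixedRoll(1).play() == "lose"


# Steps 2 and 3: inject the dependency and depend on an interface.
class Dice(Protocol):
    def roll(self) -> int: ...


class RandomDice:
    """Production adapter around the random number generator."""

    def roll(self) -> int:
        return random.randint(1, 6)


class FixedDice:
    """Step 4: fake implementation for tests."""

    def __init__(self, value: int) -> None:
        self._value = value

    def roll(self) -> int:
        return self._value


class InjectedDiceGame:
    def __init__(self, dice: Dice) -> None:
        self._dice = dice

    def play(self) -> str:
        return "win" if self._dice.roll() >= 4 else "lose"


assert InjectedDiceGame(FixedDice(4)).play() == "win"
assert InjectedDiceGame(RandomDice()).play() in ("win", "lose")
```

After step 2 the test-only subclass is no longer needed: the game logic is exercised through `FixedDice`, and `RandomDice` is small enough to be covered by a handful of integration checks.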

Once you have done this, you can cover 80% of your code with tests, because it no longer contains anything hard to test. What remains is to write a few tests that verify the wiring is correct, and you do not have to spend energy on that while testing pure logic. No more blind spots!

You will also improve the architecture and maintainability of the code. First, you find out how many dependencies the code really has. The more of them you make visible, the easier it becomes to spot the missing abstractions that will simplify the code even further.

Finally, test-friendly code is easy to reuse. That is a pleasant side effect you will get to experience... one day.

Everything will be fine

It's not easy.

It took me several years to figure this out. And I am still reading, listening and trying this approach in different situations (so far it works).

But all of this matters: testing, architecture and legacy code. These skills are interconnected. Practicing one makes you better at the others.

It may seem that this requires a lot of work and that you do not have time for it right now. But as I said, it takes a lot of practice. If you want, you can use these exercises.

One last thing: you don’t have to complete all 6 steps at once. You can work iteratively. You have to trust the process (which is why it is good to practice it on exercises), but it works. If you are worried about wasting time on an excessive number of stubs, that is because you are still in the first stage. Move on! There will be fewer stubs, and you will have more confidence in the accuracy of your tests.

Infrastructure adapters + contract tests = the missing puzzle pieces for legacy code.

Get back to work and keep writing these tests. Then go further.