

My name is Timothy, and I am a Python developer on the Platform team at Cian. Our team builds tools for product developers: libraries (an HTTP client, a web server, database access libraries), tools for monitoring microservices and the site as a whole, CI/CD integration, and much more.

Today I’ll talk about a new tool that we recently developed: a framework for functional tests.

But first things first.

Why do we need tests

In short, we believe that tests help capture system behavior. After writing new functionality or refactoring old code, we can then verify that nothing written earlier has broken and that existing functionality works the same as before the change.

At first, any code is simple and has only a couple of test cases, which can be checked by hand. But when another developer joins the project, they may not know about these cases and break something. Meanwhile the code base keeps growing, the number of cases grows with it, and checking everything by hand is no longer an option.


At Cian we have adopted a policy of covering code with tests. In code review we measure diff-coverage: the percentage of lines touched by a pull request that are covered by tests. At the moment our minimum acceptable level is 80%, and we are preparing to automatically reject pull requests with diff-coverage below this number. Thus, new code is always almost completely covered by tests.

By the way, this mechanism is very convenient when the existing code is not covered by tests at all and you want to start covering new code. In addition, the tool's report makes it easy to see whether some important part of the code has been missed: only the diff is shown on the screen, with uncovered lines highlighted in red.
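To make the metric concrete, here is a minimal sketch of how diff-coverage could be computed. It is not our actual tool: the function name, the data layout, and the sample file names are all assumptions made for illustration, and real tools count only executable lines.

```python
def diff_coverage(changed_lines, covered_lines):
    """Return the percentage of changed lines that are covered by tests.

    changed_lines: {file path: set of line numbers touched by the pull request}
    covered_lines: {file path: set of line numbers executed by the tests}
    """
    total = 0
    covered = 0
    for path, lines in changed_lines.items():
        total += len(lines)
        covered += len(lines & covered_lines.get(path, set()))
    if total == 0:
        return 100.0  # nothing was changed, so there is nothing to cover
    return 100.0 * covered / total


MIN_DIFF_COVERAGE = 80.0  # the threshold mentioned in the text

# Hypothetical data: 10 changed lines, 8 of them executed by tests.
changed = {"users/api.py": {10, 11, 12, 13, 14}, "users/db.py": {3, 4, 5, 6, 7}}
covered = {"users/api.py": {10, 11, 12, 13, 14, 20}, "users/db.py": {3, 4, 5}}

percent = diff_coverage(changed, covered)
print(f"diff-coverage: {percent:.0f}%")
print("accept" if percent >= MIN_DIFF_COVERAGE else "reject")
```

Lines that are covered but were not touched by the pull request (line 20 in the sample) do not affect the result, which is why the metric works even on a code base with no tests at all.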

Cian developers also write API tests that test a microservice in a real environment (dev, beta, or even prod), but we will not focus on them in this article.

Unit tests

Until recently, Cian developers had only one tool for reaching the coveted coverage percentage: unit tests. But is it convenient?

For all our love of unit tests, they are not a silver bullet and have unpleasant flaws:

They don’t check the entire microservice. The fact that all components of the system work correctly on their own does not mean that the system works correctly as a whole.

They break during refactoring. As soon as we split 2 classes into 3, we have to rewrite the tests. If we can’t check a refactoring with the existing tests, we can’t be sure that everything works as before. You could even say that unit tests “freeze” the architecture and prevent it from changing.

Because of this problem, developers simply stop refactoring. It is subconsciously scary to change something that cannot be verified. As they say: if it works, don’t touch it!


We decided that we could not go on like this and that developers needed a new tool for writing a new kind of test. We called them functional tests.

In essence, these are isolated API tests for microservices. Here, API is understood in its literal sense as an Application Programming Interface, that is, any interface of the microservice, be it an HTTP API, cron jobs, or RabbitMQ/Kafka consumers.

For testing, all the necessary databases, a message broker, and an HTTP mock server are started in Docker, and the microservice is launched automatically with settings pointing to them.


These tests are designed to:

  • Improve quality, by testing multi-step scenarios with API calls, RabbitMQ message processing, and cron job runs.
  • Increase development speed, by reducing the number of manual checks.
  • Simplify refactoring, by checking the entire microservice as a black box, without going into details of how it is implemented inside.
  • Simplify code coverage, by collecting coverage while the microservice is running and combining it with unit test coverage.

We decided to make the tool cross-platform, able to test microservices written in both Python and C#, and in the future frontend microservices on NodeJS in integration with a browser. For the implementation we chose Python and the well-known pytest framework. Python developers already know it from unit tests, and C# developers write API tests with it. In addition, pytest allows you to write fairly powerful plugins, which we took advantage of.

Throw away unit tests?

Of course not! Functional tests are no silver bullet either; they have their own drawbacks:

  • Although these tests are quite fast, they are still slower than unit tests. This is especially noticeable with parameterized tests.
  • A test failure is harder to analyze. If our program implements, say, a binary tree internally, finding a bug in it from a functional test report is a non-trivial task.
  • They are not as stable as unit tests, although that does not stop us from failing the CI pipeline if even a single test fails.
  • Some scenarios are impossible, or at least impractical, to verify with functional tests, for example concurrent access to objects in a multi-threaded environment.

Therefore, we encourage our developers to use each type of test where it works best.

For example, in a hypothetical user registration API, you can cover all the basic scenarios with functional tests, and the password strength check with parameterized unit tests.
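As a rough illustration of the unit-test half of that split, here is a table-driven check of a password strength function. Everything in it is hypothetical: the rule, the function name, and the cases are invented for this example. The sketch uses a plain loop so that it runs without pytest.

```python
import string


def is_strong_password(password):
    """Hypothetical rule: 8+ characters with a lowercase letter, an uppercase letter and a digit."""
    return (
        len(password) >= 8
        and any(c in string.ascii_lowercase for c in password)
        and any(c in string.ascii_uppercase for c in password)
        and any(c in string.digits for c in password)
    )


# One row per case; with pytest these rows would be passed to
# @pytest.mark.parametrize('password, expected', CASES).
CASES = [
    ("", False),
    ("short1A", False),        # only 7 characters
    ("alllowercase1", False),  # no uppercase letter
    ("ALLUPPERCASE1", False),  # no lowercase letter
    ("NoDigitsHere", False),   # no digit
    ("Sufficient1", True),
]


def check_all_cases():
    for password, expected in CASES:
        assert is_strong_password(password) is expected, password


check_all_cases()
```

Dozens of such cases run in milliseconds as unit tests, whereas each of them would cost a full HTTP round trip as a functional test.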

Framework features

Of course, to run the tests, you first need to start all the necessary databases and other services. To do this, we use our own microservice configuration format called app.toml. It already describes the deployment configuration of a microservice in a uniform way for all the programming languages we use, and now it also describes the configuration for tests:

    [[dependency]]
    type = "postgres"
    alias = "users"

    [[dependency]]
    type = "rabbitmq"

Then we run the framework's command-line utility:

    cian-functional-test-utils deps up

Under the hood, this command reads the microservice config, generates docker-compose.yml, and launches the familiar docker-compose. We do not recommend giving developers “naked” docker-compose: the framework usually knows better how to run, say, Elasticsearch so that it works reasonably well and does not eat up all the RAM. In addition, our own format lets us describe container metadata that the framework needs, such as alias in our example.

Next, in the conftest.py file, we describe the preparation of the database and the process of starting the microservice:
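A minimal sketch of the "config in, docker-compose.yml out" step might look like the following. It is not the framework's real code: the template table, image tags, ports, and function name are assumptions chosen for illustration, and the dependencies are passed in already parsed from app.toml.

```python
import json

# Hypothetical mapping from a dependency type in app.toml to container settings.
# The image tags and ports are illustrative, not the framework's real values.
SERVICE_TEMPLATES = {
    "postgres": {
        "image": "postgres:13",
        "environment": {"POSTGRES_PASSWORD": "postgres"},
        "ports": ["5432"],
    },
    "rabbitmq": {
        "image": "rabbitmq:3-management",
        "ports": ["5672", "15672"],
    },
}


def build_compose(dependencies):
    """Turn the [[dependency]] entries of app.toml into a docker-compose structure."""
    services = {}
    for dep in dependencies:
        dep_type = dep["type"]
        if dep_type not in SERVICE_TEMPLATES:
            raise ValueError(f"unsupported dependency type: {dep_type}")
        # One container per dependency type; the alias is metadata for the
        # framework (for example, a database name), not for docker-compose.
        services.setdefault(dep_type, dict(SERVICE_TEMPLATES[dep_type]))
    return {"version": "3", "services": services}


# The two dependencies from the app.toml example, already parsed.
dependencies = [
    {"type": "postgres", "alias": "users"},
    {"type": "rabbitmq"},
]

compose = build_compose(dependencies)
# JSON is also valid YAML, so this output could be saved as docker-compose.yml.
print(json.dumps(compose, indent=2))
```

Keeping the container settings in one table is what lets a framework, rather than each developer, decide how a service such as Elasticsearch should be run.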

    from pathlib import Path

    import pytest


    @pytest.fixture(scope='session', autouse=True)
    async def start(runner, pg):
        # All Cian microservices share the same interface, so the framework
        # knows how to start them and where to send the health check.
        await runner.start_background_python_web()

        # Besides the HTTP API, RabbitMQ consumers and cron jobs can be started and tested too.
        await runner.start_background_python_command('save-users-consumer')


    @pytest.fixture(scope='session')
    async def pg(postgres_service):
        db = await postgres_service.create_database_by_alias('users')

        # Use `pathlib.Path` for cross-platform paths.
        await db.execute_scripts(Path('database_schemas') / 'postgres.sql')
        return db

Preparation is complete! And here is the first test:

    async def test_v1_get_user(http, pg):  # the same pg fixture defined in conftest.py
        # arrange
        # String literals in PostgreSQL take single quotes.
        await pg.execute("INSERT INTO users (id, name) VALUES (1, 'Bart')")

        # act
        response = await http.request('GET', '/v1/get-user/', params={'id': 1})

        # assert
        assert response.status == 200
        # NOTE: the attribute name `data` is an assumption; it is missing from the original listing.
        assert response.data == {'id': 1, 'name': 'Bart'}

Besides PostgreSQL, the framework similarly supports MS SQL, Cassandra, Redis, and Elasticsearch.

That covers the HTTP API. Now let's see how to test a consumer:

    import asyncio


    async def test_save_users_consumer(pg, queue_service):
        # arrange
        # All RabbitMQ queues are deleted before each test,
        # so wait until the consumer recreates its queue.
        await queue_service.wait_consumer(queue='save-users')

        # act
        await queue_service.publish(
            exchange='users',
            routing_key='user.created',
            payload={'id': 1, 'name': 'Bart'},
        )
        await asyncio.sleep(0.5)  # give the consumer a moment to process the message

        # assert
        row = await pg.fetchrow('SELECT name FROM users WHERE id=1')
        assert row['name'] == 'Bart'

Before each test, we delete all queues in RabbitMQ (our consumers are configured to reconnect in such cases) and clean all the tables in the databases. Why before the test rather than after? So that when a test fails, you can go to the database or the RabbitMQ admin panel and see what was there at the moment of failure.

Thus, each test runs in a clean environment. In most cases it makes no difference which tests are run or in what order: the results will be the same.

Yes, sometimes there is a difference. For example, our microservices may have a local in-memory cache. Such a cache cannot be flushed unless the service exposes some external API for it. We chose a rather “dirty” solution: if a microservice is started with a certain environment variable, it starts an additional HTTP server on the port passed in that variable; we call it the Management API. Since all microservices use our own library for caching, creating an API that clears the cache was easy. All this logic is built into our libraries, so developers do not need to do anything.

As a result, each application process starts with an additional HTTP server. Before each test, the framework sends all of them a request to clear the local cache.
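Here is a minimal sketch of such a management server, built only on the standard library. The environment variable name MANAGEMENT_API_PORT, the /clear-cache/ path, and the module-level cache are all assumptions made for the example; our real libraries differ.

```python
import os
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

LOCAL_CACHE = {}  # stands in for the in-process cache of a microservice


class ManagementHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path == "/clear-cache/":  # hypothetical endpoint name
            LOCAL_CACHE.clear()
            self.send_response(204)
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the output quiet


def maybe_start_management_api():
    """Start the extra server only if the (hypothetical) variable is set."""
    port = os.environ.get("MANAGEMENT_API_PORT")
    if port is None:
        return None
    server = ThreadingHTTPServer(("127.0.0.1", int(port)), ManagementHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def clear_cache(port):
    """What a test framework could do before each test."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/clear-cache/", data=b"", method="POST"
    )
    # Bypass any proxy configured in the environment: the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status


os.environ["MANAGEMENT_API_PORT"] = "0"  # 0 lets the OS pick a free port
server = maybe_start_management_api()
LOCAL_CACHE["user:1"] = {"name": "Bart"}
status = clear_cache(server.server_address[1])
server.shutdown()
server.server_close()
```

Because the server starts only when the variable is present, a production process that is launched without it exposes nothing extra.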

HTTP mocks

For HTTP mocks we chose the mountebank tool. It can listen on several ports (one port per mocked service) and is configured entirely over HTTP. Working with it directly is not very convenient, so we made a small wrapper, which in practice looks like this:

    @pytest.fixture(scope='session')
    async def users_mock(http_mock_service):
        # Just name the microservice to be mocked; the framework
        # adds the mock's URL to the microservice settings automatically.
        return await http_mock_service.make_microservice_mock('users')


    async def test_something(users_mock):
        # arrange
        stub = await users_mock.add_stub(
            method='GET',
            path='/v1/get-user/',
            response=MockResponse(body={'firstName': 'Bart', 'lastName': 'Simpson'}),
        )

        # act
        # do something

        # assert
        # Check that the mock was called with ?userId=234
        request = (await stub.get_requests())[0]
        assert request.params['userId'] == '234'

Under the hood, when a microservice mock is created, a stub that answers every request with a 404 is created automatically. mountebank stores stubs in a list and gives them priority in list order. If, for example, our 404 stub were first in the list, we would always get a 404 regardless of the other stubs: matching would simply never reach them. That would not work, so our wrapper always inserts a new stub at the penultimate position in the list (just before the 404 stub). The earlier a stub is declared in the code, the higher its priority.

Another interesting feature of mountebank is that, by default, requests to mocks are not saved. The test above would be impossible to implement with this behavior. There are two solutions:

  • the recordRequests parameter: its drawback is that it dumps all requests into one common heap instead of saving them for each stub separately;
  • the --debug command-line option: it solves the problem perfectly.
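The ordering rule is easy to model without mountebank at all. The sketch below is a simplified, in-memory imitation of it; the class and method names are hypothetical and do not reproduce our wrapper's API.

```python
class MicroserviceMock:
    """Keeps stubs in priority order; the first matching stub wins."""

    def __init__(self):
        # The catch-all 404 stub is created together with the mock.
        self._stubs = [{"method": None, "path": None, "status": 404}]

    def add_stub(self, method, path, status=200):
        stub = {"method": method, "path": path, "status": status}
        # Insert just before the catch-all stub, that is, at the penultimate position.
        self._stubs.insert(len(self._stubs) - 1, stub)
        return stub

    def respond(self, method, path):
        for stub in self._stubs:
            matches_method = stub["method"] in (None, method)
            matches_path = stub["path"] in (None, path)
            if matches_method and matches_path:
                return stub["status"]
        raise AssertionError("unreachable: the catch-all stub matches everything")


mock = MicroserviceMock()
mock.add_stub("GET", "/v1/get-user/", status=200)
mock.add_stub("GET", "/v1/get-user/", status=500)  # declared later, so lower priority

first_match = mock.respond("GET", "/v1/get-user/")
fallback = mock.respond("GET", "/v1/unknown/")
```

With this insertion rule the stub declared first answers the request, and anything that no stub matches falls through to the 404.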
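To show the difference between the two options, here is a sketch that reads requests from an imposter description. The dictionary is hand-written and abridged; its field names follow my understanding of what mountebank returns for GET /imposters/<port> (a shared requests list for recordRequests, a matches list per stub when mountebank runs with --debug) and should be checked against the mountebank documentation. The helper function is hypothetical.

```python
# An abridged, hand-written imposter description (not real output).
imposter = {
    "protocol": "http",
    "port": 4545,
    "recordRequests": True,
    # recordRequests: every request to the imposter lands in one shared list.
    "requests": [
        {"method": "GET", "path": "/v1/get-user/", "query": {"userId": "234"}},
        {"method": "GET", "path": "/v1/unknown/", "query": {}},
    ],
    "stubs": [
        {
            "predicates": [{"equals": {"method": "GET", "path": "/v1/get-user/"}}],
            "responses": [{"is": {"statusCode": 200}}],
            # --debug: each stub additionally lists the requests it served.
            "matches": [
                {"request": {"method": "GET", "path": "/v1/get-user/",
                             "query": {"userId": "234"}}},
            ],
        },
        {
            "responses": [{"is": {"statusCode": 404}}],
            "matches": [
                {"request": {"method": "GET", "path": "/v1/unknown/", "query": {}}},
            ],
        },
    ],
}


def requests_for_stub(imposter, stub_index):
    """Return only the requests that were answered by the given stub."""
    stub = imposter["stubs"][stub_index]
    return [match["request"] for match in stub.get("matches", [])]


user_requests = requests_for_stub(imposter, 0)
print(user_requests[0]["query"]["userId"])
```

With only the shared list, a wrapper would have to re-implement predicate matching to decide which stub served which request; the per-stub list makes a method like get_requests a simple lookup.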

Test structure

As you can see, the developer needs to write a session fixture for each database and HTTP mock used. The start fixture, which describes the application startup process, depends on all of them and has the autouse=True parameter. Thus, before any tests run, all databases and HTTP mocks are initialized and the application processes are launched.

The framework has many different features, for example:

  • collection of application logs to verify logging;
  • collection of statsd and Graphite telemetry so that it can be verified;
  • interception of messages sent to RabbitMQ, for testing producers.

I won’t dwell on them: the framework is proprietary, so the details are unlikely to be of much interest to you.


If you also decide to create a similar framework in your company, you definitely cannot do without documentation for it.

We analyzed the existing tools and found nothing better than good old Sphinx. reStructuredText broke my brain at first, but after a couple of pages you begin to appreciate the full power of these tools.

Our documentation contains:

  • a brief description of which tests the developer should write;
  • an article on setting up the environment for writing them, which is especially relevant for our C# developers;
  • simpler and more complex use cases;
  • API Reference, where you can see a detailed description of all classes and methods.

In the end, the documentation should be such that, after reading it, a new developer can immediately start writing correct functional tests.

Type annotations

Python is a dynamically typed language. Digging through the documentation every time you forget the name of the method you need is unpleasant and unproductive. Therefore, we covered the framework code with PEP 484 type annotations.

Thanks to this and PyCharm's pytest support, the IDE autocompletes fixture methods just like in regular code.

pytest support is also available in other JetBrains IDEs with the Python Community Edition plugin installed. Our C# developers use Rider; for them, being so used to static typing and IDE hints, this is especially important.


As you may have noticed, the resulting framework is tightly coupled to the microservice architecture at Cian. It cannot be used in other companies, so there is no point in publishing it as open source.

We just want to show that developing such a framework, in our opinion, pays off in:

  • the quality of the result;
  • development speed;
  • less technical debt, thanks to easier refactoring;
  • finally, the happiness of developers who are tired of writing lots of sometimes useless unit tests.

We can’t back this up with figures or statistics; it is just our subjective opinion, based on our experience of writing functional tests.

That's all, ask questions in the comments.

Thank you for your attention.