Imagine a waterfall. Powerful. Inexorable. Always moving forward toward an imminent descent, driven by one of the fundamental forces of the universe.

Waterfalls are staggering by their very nature, so it is not surprising that engineers are a little obsessed with them. The old DOD-STD-2167A standard recommended the waterfall model, and my own dated engineering background was built on the Phase-Gate model, which, in my opinion, looks an awful lot like a waterfall. On the other hand, those of us who studied computer science at university probably know that the waterfall model is, to some extent, an anti-pattern. Our friends in the academic ivory tower tell us that no, Agile is the path to success, and industry seems to have proven them right.

So, what should a developer choose: the aging waterfall model or newfangled Agile? Does the equation change when it comes to developing algorithms? Or safety-critical software?

As usual in life, the answer is somewhere in between.

Hybrid, spiral and V-shaped models


Hybrid development is the answer in the middle. Where the waterfall model does not allow you to go back and change requirements, the hybrid model does. And where Agile struggles with up-front engineering, hybrid development leaves room for it. Moreover, hybrid development aims to reduce defects in the final product, which is exactly what we want when developing algorithms for safety-critical applications.

Sounds good, but how effective is it?

To answer this question, we bet on hybrid development while working on the NDT localization algorithm. Localization is an integral part of any autonomous driving stack that goes beyond purely reactive control. If you do not believe me, or are not familiar with localization, I highly recommend taking a look at some of the project documents that were developed as part of this process.

So what is hybrid development, in a nutshell? From my amateur point of view, I would call it an idealized V-model, or a spiral model. You plan, design, implement, and test, and then iterate over the entire process based on the lessons learned and the new knowledge you have acquired along the way.

Practical Application


More specifically, the NDT working group at Autoware.Auto finished our first descent down the left side of the V-model (that is, completed the first iteration of the design phase) in preparation for the Autoware Hackathon in London (hosted by Parkopedia!). Our first pass through the design phase consisted of the following stages:

  1. Literature review
  2. Review of existing implementations
  3. High-level component design, use cases, and requirements
  4. Failure analysis
  5. Metric definition
  6. Architecture and API design

You can take a look at each of the resulting documents if you are interested in this sort of thing, but in the rest of this post I will try to walk through some of them and explain what came out of each stage, and why.

Review of the Literature and Existing Implementations


The first step of any worthy undertaking (which is exactly how I would classify implementing NDT) is to see what other people have done. People, after all, are social beings, and all our achievements stand on the shoulders of giants.

Allusions aside, there are two important areas to pay attention to when surveying prior art: the academic literature and functional implementations.

It is always useful to look at what poor starving graduate students have been toiling away at. In the best case, you will find an excellent algorithm you can implement instead of inventing your own. In the worst case, you will gain an understanding of the solution space and its variations (which can inform the architecture), and you will also learn some of the theoretical foundations of the algorithm (and thus which invariants you must maintain).

On the other hand, it is just as useful to look at what other people are doing; after all, it is always easiest to start from something that already works. You can not only borrow good architectural ideas for free, but also discover the guesses and dirty tricks you may need to make the algorithm work in practice (and perhaps even integrate them fully into your architecture).

From our NDT literature review, we compiled the following useful pieces of information:

  • The NDT family of algorithms has several variants:
    - P2D
    - D2D
    - Constrained
    - Semantic
  • There are tons of dirty tricks you can use to make the algorithm work better.
  • NDT is usually compared against ICP.
  • NDT is a little faster and a little more robust.
  • NDT works reliably (has a high success rate) within a given area.

Nothing earth-shattering, but this information can be saved for later use, in both design and implementation.
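To make the comparison above more tangible, here is the core idea shared by the NDT variants, sketched as a 2D toy: each map cell summarizes its points as a normal distribution, and a scan point is scored by its likelihood under that cell's Gaussian. This is an illustrative sketch under my own naming, not Autoware.Auto code:

```python
import math

def ndt_cell_score(point, mean, cov):
    """Score a scan point against one map cell's normal distribution.

    A P2D-NDT map cell stores the mean and covariance of the map points
    that fell into it; a scan point is scored by its likelihood under
    that Gaussian (up to a normalization constant).
    """
    # Invert the 2x2 covariance matrix by hand.
    (a, b), (c, d) = cov
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("degenerate covariance")
    inv = ((d / det, -b / det), (-c / det, a / det))
    # Squared Mahalanobis distance of the point from the cell mean.
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    m2 = (dx * (inv[0][0] * dx + inv[0][1] * dy)
          + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.exp(-0.5 * m2)
```

Roughly speaking, P2D-NDT picks the alignment transform that maximizes the sum of such scores over all scan points, while D2D-NDT matches the scan's own distributions against the map's distributions instead.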

Similarly, from our review of existing implementations, we picked up not only concrete implementation steps but also some interesting initialization strategies.

Use cases, requirements, and mechanisms


An integral part of any design-first or plan-first development process is considering the problem you are trying to solve at a high level. Broadly speaking, from a functional safety standpoint (in which, I admit, I am far from an expert), a high-level look at the problem is organized roughly as follows:

  1. What use cases are you trying to address?
  2. What requirements (or constraints) must the solution satisfy to support those use cases?
  3. What mechanisms satisfy those requirements?

This process provides a disciplined, high-level view of the problem that becomes progressively more detailed.

To get an idea of what this might look like, you can take a look at the high-level localization design document that we put together in preparation for the NDT development. If you are not in the mood for bedtime reading, read on.

Use cases


I like to approach use cases from three angles, plus a bonus (again, I am not a functional safety specialist):

  1. What should the component do at all? (remember SOTIF!)
  2. In what ways can input enter the component? (input use cases, which I like to call bottom-up)
  3. In what ways can output leave the component? (output, or top-down, use cases)
  4. Bonus question: what overall system architectures might this component live in?

Putting it all together, we came up with the following:

  • Most algorithms can use localization, but ultimately they split into varieties that operate either locally or globally.
  • Local algorithms need continuity in their transform history.
  • Almost any sensor can serve as a source of localization data.
  • We need a way to initialize our localization methods and to recover from failures.

Besides the various use cases you might think of, I also like to keep in mind a couple of extreme use cases with very strict requirements. For this, I use the scenario of a fully autonomous off-road drive through several tunnels while driving in a convoy. There are a few pain points in this use case, such as the accumulation of odometry error, floating-point error accumulation, localization corrections, and outages.
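The floating-point pain point in that scenario is easy to demonstrate. The snippet below (a toy illustration, not project code) shows why storing a pose coordinate far from the origin in single precision silently swallows small odometry updates, which is one reason local coordinate frames exist:

```python
import struct

def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack('f', struct.pack('f', x))[0]

# 100 km from the origin, the spacing between adjacent float32 values
# is about 7.8 mm, so a 1 mm odometry update simply disappears.
pos = to_f32(100_000.0)
lost = to_f32(pos + 0.001) == pos

# Near the origin, the same 1 mm update is preserved.
origin_pos = to_f32(1.0)
kept = to_f32(origin_pos + 0.001) != origin_pos
```

Keeping the vehicle's working frame near the origin (as a local odometry frame does) sidesteps this loss of precision entirely.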

Requirements


The purpose of developing use cases, beyond summarizing the problem you are trying to solve, is to derive requirements. For a use case to be satisfied, certain conditions must hold or be achievable. In other words, each use case implies a specific set of requirements.

In the end, the general requirements for a localization system are not so daunting:

  • Provide transforms for local algorithms
  • Provide transforms for global algorithms
  • Provide a mechanism for initializing relative localization algorithms
  • Ensure that transforms do not grow unbounded
  • Ensure compliance with REP 105

A qualified functional safety specialist would likely formulate many more requirements. The value of this work is that we explicitly state the requirements (or constraints) on our design, which the mechanisms we choose must in turn satisfy.
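As an illustration of how such requirements might surface in code, here is a minimal sketch. The interface and names are hypothetical, not Autoware.Auto's actual API; the bounds check stands in for the "transforms must not grow unbounded" requirement:

```python
import math
from abc import ABC, abstractmethod

class RelativeLocalizer(ABC):
    """One way the requirements above might surface as an interface.

    Hypothetical names, not Autoware.Auto's actual API.
    """

    @abstractmethod
    def set_initial_pose(self, pose):
        """Mechanism for initializing a relative localization algorithm."""

    @abstractmethod
    def update(self, measurement):
        """Consume a measurement; return a transform estimate."""

def transform_is_bounded(translation, limit=10_000.0):
    """Guard for 'transforms must not grow unbounded': reject estimates
    that are non-finite or larger than the deployment's sanity limit
    (the limit here is an arbitrary placeholder)."""
    return all(math.isfinite(c) and abs(c) <= limit for c in translation)
```

A check like this would sit at the component boundary, so that downstream consumers of the transform never see unbounded or non-finite values.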

Mechanisms


The end result of any analysis should be an actionable set of lessons or artifacts. If we cannot use the result of an analysis (even a negative one!), then the analysis was a waste of time.

In the case of a high-level design document, this means a set of mechanisms, or a design encapsulating those mechanisms, that can adequately satisfy our use cases.

This particular high-level localization design yielded a set of software components, interfaces, and behaviors that make up the architecture of the localization system. A simple block diagram of the proposed architecture is shown below.

[Image: block diagram of the proposed localization architecture]

If you are interested in more detail about the architecture or the design, I highly recommend checking out the full document.

Failure Analysis


Since we are developing components for safety-critical systems, failures are something we should try to avoid, or at least mitigate. Therefore, before we try to design or build anything, we should at least know how things can break.

When analyzing failures, as in most things, it is useful to look at the component from several points of view. For the NDT failure analysis, we considered it in two different ways: as a general (relative) localization mechanism, and specifically as an instance of the NDT algorithm.

Viewed as a localization mechanism, the main failure mode is this: what happens if bad data comes in? Indeed, from the perspective of an individual component, little can be done beyond basic sanity checks on the input. At the system level, you have more options (for example, engaging safety fallbacks).

Considering NDT as an algorithm in isolation, it is useful to abstract the algorithm to an appropriate level. A pseudocode version of the algorithm helps here (and will also help you, the developer, understand the algorithm better). In this case, we walked through the algorithm in detail and examined every situation in which it might break.

An implementation error is a perfectly plausible failure mode, although it can be addressed with appropriate testing. Subtler and more insidious are the nuances of numerical algorithms: in particular, inverting matrices, or more generally solving systems of linear equations, can introduce numerical errors. This is a delicate failure mode, and it deserves attention.

Two other important failure modes we identified are checking that certain expressions do not grow unbounded in magnitude (guarding the precision of floating-point computations), and ensuring that the size of the inputs is always validated.
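To illustrate the numerical failure mode, here is a hedged sketch of a linear solve that refuses near-singular input instead of silently returning garbage. It is a toy 2x2 solver with a made-up tolerance, not the project's actual code; a real implementation would use a linear algebra library and a proper condition-number estimate:

```python
def solve_2x2(A, b, eps=1e-12):
    """Solve A x = b for a 2x2 system, refusing near-singular inputs.

    Numerical routines should detect ill-conditioned problems rather
    than silently return garbage.
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    # Scale the determinant test by the matrix magnitude so the guard
    # is not fooled by uniformly large or small entries.
    scale = max(abs(a11), abs(a12), abs(a21), abs(a22), 1.0)
    if abs(det) <= eps * scale * scale:
        raise ArithmeticError("matrix is singular or ill-conditioned")
    # Cramer's rule, fine for a 2x2 toy.
    x1 = (b[0] * a22 - b[1] * a12) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return (x1, x2)
```

The key point is the explicit guard: a degenerate system raises a detectable error at the component boundary instead of propagating a garbage transform downstream.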

In total, we made 15 recommendations. I would encourage you to take a look at them.

I will also add that, although we did not use it here, fault tree analysis is an excellent tool for structuring and quantifying a failure analysis.

Metric Definition


“What gets measured gets managed.”
- A popular management phrase

Unfortunately, in professional development it is not enough to shrug and say "done" when you get tired of working on something. In essence, any work package (and the NDT development is, again, one) needs acceptance criteria, which must be agreed upon by both the customer and the supplier (if you are both the customer and the supplier, skip this step). An entire body of contract law exists to support these aspects, but as engineers we can simply cut out the middlemen by creating metrics to define when our components are done. After all, numbers are (mostly) unambiguous and irrefutable.

Even when acceptance criteria are not required or do not matter, it is still nice to have a well-defined set of metrics characterizing and improving the quality and performance of a project. After all, what gets measured can be verified.

For our NDT implementation, we divided the metrics into four broad groups:

  1. General software quality metrics
  2. General embedded software quality metrics
  3. General algorithm metrics
  4. Localization-specific metrics

I will not go into detail, since all of these metrics are relatively standard. The important point is that metrics were defined and identified for our particular problem, and that is roughly as far as we, as open-source developers, can go. Ultimately, the acceptance bar should be set by those who deploy the system, based on the specifics of their project.

The last thing I will repeat here: although metrics are fantastic for testing, they are not a substitute for validating your understanding of the implementation and the usage requirements.

Architecture and API


After the painstaking work of characterizing the problem we are trying to solve and building an understanding of the solution space, we can finally dive into the territory bordering on implementation.

Recently, I have become a fan of test-driven development. Like most engineers, I like having a process, but the idea of writing tests first used to seem cumbersome to me. When I started programming professionally, I went straight ahead and tested after developing (even though my university professors told me to do the opposite). Research shows that writing tests before the implementation generally leads to fewer bugs, higher test coverage, and better code overall. Perhaps more importantly, I believe test-driven development helps tackle the real difficulty of implementing algorithms.

What does it look like?

Instead of filing a monolithic ticket called "Implement NDT" (tests included), which would result in several thousand lines of code that cannot be effectively reviewed, you can break the problem down into more digestible pieces:

  1. Write the classes and public methods for the algorithm (define the architecture)
  2. Write tests for the algorithm against the public API (they should fail!)
  3. Implement the logic of the algorithm
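As a sketch of steps 1 and 2, here is what an API-first stub and a deliberately failing test might look like. All names here are illustrative, not Autoware.Auto's actual API:

```python
class NdtLocalizer:
    """Step 1: the public API only, with no logic yet."""

    def set_map(self, map_cells):
        raise NotImplementedError

    def register_scan(self, scan, initial_guess):
        """Return the transform aligning `scan` with the map."""
        raise NotImplementedError


# Step 2: a test written against the public API. Until step 3 fills in
# the logic, it fails in exactly the way test-driven development expects.
def test_register_scan_is_still_a_stub():
    try:
        NdtLocalizer().register_scan(scan=[], initial_guess=(0.0, 0.0, 0.0))
    except NotImplementedError:
        return True  # the expected failure before implementation
    return False
```

The point of the split is that the API and the tests can each be reviewed on their own, in pieces small enough to actually read.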

So, the first step is to design the architecture and the API for the algorithm. I will cover the remaining steps in another post.

Although many works discuss how to "create architecture," designing software architecture still feels like something of a black art to me. Personally, I like to think of software architecture design as drawing boundaries between concepts, and as an attempt to characterize the degrees of freedom in both the problem statement and its solution at the level of those concepts.

So what are the degrees of freedom in NDT?

A review of the literature tells us that there are different ways of representing the scan and the observation (for example, P2D-NDT and D2D-NDT). Similarly, our high-level design document says that we have several ways of representing the map (static and dynamic), so that is also a degree of freedom. More recent literature also suggests that the optimization problem itself can be reformulated. What is more, comparing practical implementations against the literature, we see that even the details of the optimization solver can differ.

And the list goes on and on.

Following the initial design results, we settled on the following concepts:

  • Optimization problems
  • Optimization solvers
  • Scan representations
  • Map representations
  • Initial hypothesis generation systems
  • Algorithm and node interfaces

With some substructure within each of these points.
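One way to picture this decomposition is as a set of pluggable concepts, where a concrete NDT variant is just a particular combination. This is a schematic sketch with made-up class names, not the actual design:

```python
# Each degree of freedom becomes a pluggable concept; a concrete NDT
# variant is then just a particular combination of concepts.
class P2DOptimizationProblem: ...
class D2DOptimizationProblem: ...
class NewtonSolver: ...
class StaticMap: ...


class NdtPipeline:
    """Composes independently swappable concepts."""

    def __init__(self, problem, solver, map_repr):
        self.problem = problem
        self.solver = solver
        self.map_repr = map_repr


# Swapping one concept (P2D to D2D) leaves the other choices untouched.
p2d = NdtPipeline(P2DOptimizationProblem(), NewtonSolver(), StaticMap())
d2d = NdtPipeline(D2DOptimizationProblem(), NewtonSolver(), StaticMap())
```

Drawing the boundaries this way is what makes the architecture extensible: adding a new map representation or solver should not require touching the other concepts.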

The overarching hope for the architecture is that it will be extensible and maintainable. Whether our proposed architecture lives up to that hope, only time will tell.

Next


After design, of course, comes implementation. The official implementation of NDT in Autoware.Auto was done at the Autoware hackathon organized by Parkopedia.

It bears repeating that what was presented here is only the first pass through the design phase. No battle plan survives contact with the enemy, and the same can be said of software designs. The ultimate failing of the waterfall model was the assumption that specification and design would be executed perfectly. Needless to say, neither the specification nor the design is perfect, and as implementation and testing proceed, shortcomings will be found and changes will have to be made to the designs and documents laid out here.

And that is fine. We, as engineers, are not our work and should not identify with it; all we can do is iterate and strive toward perfect systems. After everything said here about the development of NDT, I think we have taken a good first step.
