Linear approximation of a combination of lines for a set of noisy points
Statement of the problem
Consider the problem of approximating a combination of straight lines over a set of noisy coordinates of points located on a given combination of lines (see Fig. 1 and Fig. 2). The usual linear approximation formula is not suitable here, since the points are mixed and the result will be a certain average line between them (see Fig. 3).
Fig. 1 Combination of lines and a noisy set of coordinates
Fig. 2 Combination of lines and a noisy set of coordinates on an enlarged scale
Fig. 3 Linear approximation result
The only way that came to mind was to iterate over different line options. Those. we sort through all possible angles, naturally on a limited grid, from -90 degrees to +90 degrees (from -180 to 180 is meaningless, because the line is symmetrical about the center of coordinates).
Thus, sorting out all kinds of angles, we evaluate the angle of the line at which the largest number of points are at the same distance. This angle is the desired angle of inclination of the line, but it is approximate, due to the discreteness of the initial set of angles. Next, we build a linear approximation based on the obtained set of points and get the most approximate approximation of the first line.
To find the next line, the points used in the previous step are removed from consideration. Thus, line by line, we approximate all the lines.
1. A set of angles under consideration
At this stage, you need to figure out which set of angles will be moved. Knowing the features of a particular task, you can correct this set of angles, thereby speeding up the process of detecting lines. You can also take into account the features of the angle of inclination between the lines, that is, at each next step, narrow the range of angles to be sorted. In this problem, we will use a uniform set of angles from -90 to 90 degrees in increments of 0.1 degrees in the forehead.
2. Determining the distance from a point to a line
For the convenience of sorting out all kinds of line options, we need to derive a formula for estimating the shortest distance from a point to a line.
Let the equation of the line and the coordinates of the point from which we measure the distance take the following notation:
Then, since the shortest distance from a point to a line is a perpendicular to this line, we get the equation of a perpendicular passing through a given point of the following form:
Then, the intersection point of these two lines:
$$ display $$ y=k (x_p + ky_p-bk)/(k ^ 2 + 1) + b=(〖kx〗 _p + k ^ 2 y_p-bk ^ 2 + bk ^ 2 + b)/(k ^ 2 + 1)=(〖kx〗 _p + k ^ 2 y_p + b)/(k ^ 2 + 1) $$ display $$
Get the distance from the point of interest to the intersection point:
3.Building a histogram of distances
If we simply construct a histogram of distances, it will be uninformative, so we need to construct a histogram using a sliding window, so it will be smoother and we will get the result with less error (see Fig. 4-6).
From the constructed histogram, we determine the distance with the maximum number of points. Using the obtained set of the maximum number of points for each angle, we build the dependence of the number of points and the angle of inclination of the line (see Fig. 7, 8). In Fig. 7 you can see two sharp peaks, these peaks are the slopes of the lines of interest to us.
Fig. 4 Histogram of distances (erroneous line)
Fig. 5 Histogram of distances (right line)
Fig. 6 Larger histogram of distances (right line)
Fig. 7 Histogram of the distribution of the maximum number of equidistant points depending on the angle of inclination of the line in question (Step 1)
Fig. 8 Histogram of the distribution of the maximum number of equidistant points depending on the angle of inclination of the line in question (Step 2)
4. Building a linear approximation
The line angle obtained from the histogram is inaccurate, since it was obtained on a fixed grid of angles. To obtain a more accurate angle of inclination and displacement of the line of interest, it is necessary to perform a linear approximation of the obtained set of points using the following formulas (see Fig. 9 and Fig. 10):
$$ display $$ k=(N∑_1 ^ N (xy) - ∑_1 ^ Nx ∑_1 ^ Ny)/(N∑_1 ^ Nx ^ 2 - (∑_1 ^ Nx) ^ 2); b=(∑_1 ^ Ny-k∑_1 ^ Nx)/N $$ display $$
Fig. 9 The result of the approximation of the first line
Fig. 10 Second Line Approximation Result
Here are examples of recognizing lines on arbitrary sets of points (see Figure 11-13).
Fig. 11 The result of working on an arbitrary selection of points
Fig. 12 The result of working on an arbitrary selection of points
Fig. 13 The result of working on an arbitrary selection of points
Using the above algorithm, you can detect straight lines with a relatively high speed and determine their parameters (slope and offset). The number of lines can be unlimited.
For successful detection of lines, it is desirable that the points are scattered as much as possible along the coordinate system, since if all the points are nearby, then during the procedure of moving the detection from one line to another, the points of the new line will be discarded due to their proximity to the line from the previous one iteration.
The main purpose of this article is to see the view from the side, whether anyone had similar problems and how you solved them. Suddenly there are ways that can detect these lines in a single pass.See some interesting solutions that can be used in other tasks..