The best data products are born in the fields
My name is Marina Calabin, and I am a project manager at Leroy Merlin. I joined the company in 2011. For the first five years I opened stores (when I arrived there were 13, now there are 107), then I worked in a store as head of a trading sector, and for the past year and a half I have been working on helping the stores.
Since I have been with the company for a long time, my speech is full of company-specific terms, which I call "Lerulisms". So that we speak the same language, here are a few of them:
- Stock - the stock of goods in a store.
- Available-for-sale stock - the quantity of goods free from locks and customer reserves.
- Expo - a product sample on display.
- Articles - products (SKUs).
- Operational inventory - a daily recount of 5 items in each department of every store.
You may not know this, but when you place an order with Leroy Merlin, in 98% of cases it goes to a store and is picked from the sales floor.
Imagine a huge 8,000 sq. m store with 40,000 articles and the task of picking an order. What can happen to the articles the picker is looking for? A product may already be in the basket of a customer walking around the sales floor, or it may even have been sold between the moment you ordered it and the moment the picker went for it. The site shows the product as available, but in reality it is either already claimed or gone.
To deal with various problems, including this one, last year the company launched the Data Accelerator division. Its mission is to instill a data-driven culture in the company.
The essence of the product is that before publishing an article's stock on the site, we check whether we can actually pick it for the customer, whether we can guarantee it. Most often this means publishing a slightly smaller stock figure on the site than the system reports.
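A minimal sketch of that idea, with a hypothetical function and buffer size (not the actual production logic):

```python
# Sketch of the "publish less than you have" idea: before sending stock
# to the site, hold back a safety buffer so that every published unit
# can actually be picked for the customer.

def published_stock(available_for_sale: int, safety_buffer: int) -> int:
    """Stock figure to publish on the site: never negative, and always
    `safety_buffer` units below what the system reports."""
    return max(available_for_sale - safety_buffer, 0)

# 10 units in the system, buffer of 2 -> publish 8.
print(published_stock(10, 2))  # 8
# A single unit (possibly the expo) is hidden from the site entirely.
print(published_stock(1, 2))   # 0
```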
We had a great team: a Data Scientist, a Data Engineer, a Data Analyst, and a Product Owner.
Our product goals were:
- reduce the number of unassembled orders without reducing the overall number of orders;
- preserve eCom turnover, even though we would be displaying less stock on the site.
In other words, fix the picking problem without making anything else worse, all other things being equal.
Bureau of Investigation
When the project started, we went to the stores, to the people who deal with this every day, and picked orders ourselves. It turned out that our product was so interesting and so needed by the stores that we were asked to launch not in 3 months, as originally planned, but twice as fast, in 6 weeks. To put it mildly, it was stressful, but nonetheless...
We collected hypotheses from experts and went looking for what data sources we had at all. That was a quest in itself. In fact, the "bureau of investigation" showed that we have products that must always have a display sample.
Take a mixer, for example: such products always have a sample on the floor. Moreover, we have no right to sell the expo, because it may already be damaged and the warranty does not cover it. We found articles that have no display sample, yet whose available-for-sale stock shows 1. Most likely, that one unit is the very expo we cannot sell, and yet a customer can order it. This is one of the problems.
The next story is the opposite: we found that sometimes too many display units are recorded for a product. Most likely, either the system failed or the human factor intervened. Instead of displaying 2,500 installation boxes on the site, we can only show 43 because of such a failure. We taught our algorithms to find these errors as well.
After examining the data, we compiled lists of suspicious articles and went to the stores to verify them.
As for those examples: when we flagged too many display cases, we were right in suggesting an error in almost 60% of cases. And when we looked for an insufficient number of expos, or their complete absence, we were right 81% of the time, which, in our view, is a very good result.
Starting MVP. Stage One
Since we had to meet the 6-week deadline, we ran a proof of concept with a simple linear algorithm that found abnormal values and corrected the stock for them before publishing to the site. We ran it in two stores in two different regions so that we could compare the effect.
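The article does not show the algorithm itself, so here is a minimal, hypothetical sketch of such a linear anomaly check: a stock value is flagged when it deviates too far from the article's recent history (the threshold and the data are purely illustrative).

```python
from statistics import mean, stdev

def is_abnormal(stock_history: list, current: int, z_threshold: float = 3.0) -> bool:
    """Flag `current` as abnormal if it is more than `z_threshold`
    standard deviations away from the article's recent stock levels."""
    mu = mean(stock_history)
    sigma = stdev(stock_history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold

# A sudden stock of 2500 against a history around 40-50 is flagged.
history = [43, 45, 41, 44, 42, 46, 43]
print(is_abnormal(history, 2500))  # True
print(is_abnormal(history, 44))    # False
```

A real system would, of course, also use seasonality and per-article sales patterns; this only illustrates the "linear algorithm that found abnormal values" mentioned above.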
In addition, we built a dashboard where, on the one hand, we monitored technical parameters and, on the other, showed our customers, the stores themselves, how the algorithms performed. That is, we compared how things worked before and after the launch and showed how much money the use of these algorithms brings in.
Rule "-1". Stage Two
The effect of the product quickly became noticeable, and we were asked why we were processing so few articles: "Let's take the store's entire stock, subtract one piece from each article, and maybe that will solve the problem globally." By that time we had already started working on a machine learning model, and it seemed to us that such "carpet bombing" could do a lot of harm, but we did not want to miss the chance to run the experiment. So we tested this hypothesis in 4 stores.
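The rule itself is trivial to express; a hypothetical sketch, not the production implementation:

```python
def rule_minus_one(store_stock: dict) -> dict:
    """The 'carpet' experiment: subtract one unit from every article's
    published stock, flooring at zero."""
    return {article: max(qty - 1, 0) for article, qty in store_stock.items()}

stock = {"mixer": 5, "paint": 1, "tile": 0}
print(rule_minus_one(stock))  # {'mixer': 4, 'paint': 0, 'tile': 0}
```

The side effect is visible even here: every article with exactly one unit disappears from the site, whether that unit is a phantom expo or a perfectly sellable product, which is why a targeted model is preferable to a blanket rule.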
When we looked at the results a month later, we discovered two important things.
ML Model. Stage Three
So, here is how we built the model:
- The model is implemented with gradient boosting using CatBoost; it predicts the probability that the stock figure for a given article in a given store is currently incorrect.
- The model was trained on the results of operational and annual inventories, including data on canceled orders.
- As indirect signs of a possibly incorrect stock figure, we used features such as the latest stock movements of the product, sales, returns and orders, the available-for-sale stock, the nomenclature, some product characteristics, etc.
- In total, about 70 features were used in the model.
- Among all the features, the important ones were selected using several importance-assessment approaches, including Permutation Importance and the methods implemented in the CatBoost library.
- To evaluate quality and select hyperparameters, the data were split into training and validation samples in an 80/20 ratio.
- The model was trained on older data and validated on newer data.
- The final model that went to production was trained on the full dataset, using the hyperparameters selected on the train/validation split.
- The model and its training data are versioned with DVC; model and dataset versions are stored on S3.
The final metrics of the resulting model on the validation dataset:
- Recall: 0.77
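The article names CatBoost but does not show code. As a self-contained, runnable illustration of the same protocol (an 80/20 split and a recall check on the held-out sample), this sketch uses scikit-learn's GradientBoostingClassifier as a stand-in, with fully synthetic features and labels; in the real project the split was time-ordered, not random.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for the ~70 real features (stock movements,
# sales, returns, orders, nomenclature attributes, ...).
X = rng.normal(size=(2000, 10))
# Label: 1 = the stock figure is incorrect. Depends on two features,
# with 5% label noise.
y = ((X[:, 0] + X[:, 1] > 0.5) ^ (rng.random(2000) < 0.05)).astype(int)

# 80/20 split for quality evaluation and hyperparameter selection.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

p_incorrect = model.predict(X_val)
print(f"validation recall: {recall_score(y_val, p_incorrect):.2f}")
```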
A bit about the architecture and how it runs in production. To train the model, we use replicas of the company's operational and product systems, consolidated in a single DataLake on the GreenPlum platform. From these replicas we compute features that are stored in MongoDB, which gives us hot access to them. Orchestration of the feature calculation and the integration between GreenPlum and MongoDB are handled by a separate orchestration layer.
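To illustrate the hot-access pattern (this is not the production code), here is a sketch in which a plain dict keyed by (store, article) stands in for the MongoDB collection; in production the lookup would be a find_one against MongoDB, and all keys and feature names below are made up.

```python
from typing import Optional

# In-memory stand-in for the MongoDB hot feature store,
# keyed by (store_id, article_id).
feature_store = {
    (101, 555001): {"days_since_last_movement": 2.0, "sales_7d": 14.0},
    (101, 555002): {"days_since_last_movement": 30.0, "sales_7d": 0.0},
}

def get_features(store_id: int, article_id: int) -> Optional[dict]:
    """Hot lookup of precomputed features for one (store, article) pair,
    mirroring a find_one({'store': ..., 'article': ...}) in MongoDB."""
    return feature_store.get((store_id, article_id))

print(get_features(101, 555002))  # {'days_since_last_movement': 30.0, 'sales_7d': 0.0}
```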
The machine learning model itself runs as a containerized service.
We ran it in 6 stores, and the results showed that, against the planned 15%, we reduced the number of unassembled orders by 12%, while the number of orders and eCom turnover did not suffer.
At the moment, the model we trained is used not only to correct stock before publishing on the site, but also to improve the operational inventory algorithms: which articles should be counted today in this department of this store? The ones customers will come for and that are worth checking. The model turned out to be multifunctional and is reused by other divisions in the company.
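A hypothetical sketch of that reuse: given the model's per-article probabilities that the stock figure is wrong, the operational inventory simply recounts the top-k most suspicious articles (the names, scores, and k=5 below are illustrative, echoing the "5 items per department" definition).

```python
def articles_to_recount(scores: dict, k: int = 5) -> list:
    """Pick the k articles whose stock is most likely wrong, i.e. the
    ones worth recounting in today's operational inventory."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical model outputs: P(stock figure is incorrect) per article.
scores = {"A1": 0.91, "A2": 0.05, "A3": 0.66, "A4": 0.80,
          "A5": 0.12, "A6": 0.73, "A7": 0.40}
print(articles_to_recount(scores))  # ['A1', 'A4', 'A6', 'A3', 'A7']
```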
P.S. This article is based on a talk at an Avito.Tech meetup; you can watch the video at the link.