Has this ever happened to you: you search for something on Google and type “vs” after the search term, hoping the search engine will automatically suggest something similar to what you need?

[Figure: typing “vs” after a search term]

This has happened to me.

As it turns out, this is a big deal. When you are looking for an alternative to something, this technique can save a ton of time.

I see 3 reasons why this technique works so well when you search for information about technologies, products, and concepts you want to understand:

  1. The best way to learn something new is to find out how it resembles, or differs from, what you already know. For example, in the list of suggestions that appears after “vs”, you may spot something that makes you say: “Ah, so what I'm looking for is similar to this thing I already know.”
  2. It is a simple trick. It takes literally a few seconds to use.
  3. The word “vs” clearly tells Google that the user wants to compare one thing directly with another. You can use “or” instead, but it does not express the intent to compare as strongly, so with “or” Google's list of suggestions is more likely to contain something irrelevant.

[Figure: for the query “bert or”, Google suggests Sesame Street; for the query “bert vs”, it suggests topics related to Google BERT]

This made me think. What if we take the words Google suggested after “vs” and search for each of them, again adding “vs”? And what if we repeat this several times? The result is a nice network graph of related queries.

For example, it might look like this.

[Figure: ego graph for the query bert, radius 25]

This is a very useful technique for creating mental maps of technologies, products, or ideas that show how they relate to one another.

I’ll tell you how to build such graphs.

Automating the collection of “vs” data from Google


Here is a link that returns Google's query-completion suggestions in XML format. It does not look like an API intended for widespread use, so you probably shouldn't rely on it too heavily.

http://suggestqueries.google.com/complete/search?&output=toolbar&gl=us&hl=en&q=<search_term> 

The output=toolbar URL parameter indicates that we want the results as XML, gl=us sets the country code, hl=en sets the language, and q=<search_term> is the query for which we want completion suggestions.

The gl and hl parameters use the standard two-letter country and language codes.

Let's experiment by starting the search with, say, the query tensorflow.

The first step is to request the URL above with the following query parameter: q=tensorflow%20vs%20. The entire link will look like this:

http://suggestqueries.google.com/complete/search?&output=toolbar&gl=us&hl=en&q=tensorflow%20vs%20 

In response, we get the XML data.
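As a rough sketch of this step, the Python below builds the request URL and pulls the suggestion strings out of the XML. It assumes the response has the shape toplevel > CompleteSuggestion > suggestion with a data attribute; that layout is an assumption, not something the endpoint documents, so check it against a real response. The helper names build_suggest_url, parse_suggestions and fetch_suggestions are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

BASE_URL = "http://suggestqueries.google.com/complete/search"


def build_suggest_url(term, country="us", language="en"):
    """Build the completion URL for '<term> vs ' (spaces encoded as %20)."""
    params = {
        "output": "toolbar",  # ask for XML
        "gl": country,        # two-letter country code
        "hl": language,       # two-letter language code
        "q": f"{term} vs ",
    }
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def parse_suggestions(xml_text):
    """Return the suggestion strings, in the order Google ranked them.

    Assumed layout: <suggestion data="..."/> elements somewhere in the tree.
    """
    root = ET.fromstring(xml_text)
    return [
        node.attrib["data"]
        for node in root.iter("suggestion")
        if "data" in node.attrib
    ]


def fetch_suggestions(term):
    """Network call: download and parse the suggestions for one term."""
    with urllib.request.urlopen(build_suggest_url(term), timeout=10) as response:
        return parse_suggestions(response.read())
```

Calling fetch_suggestions("tensorflow") would then request the link shown above. Since the endpoint is unofficial, it is wise to keep the request rate low and to expect the format to change.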

What to do with XML?


Now the completion results need to be checked against a set of criteria. We will continue working with those that pass.

[Figure: checking the results]

When checking the results, I used the following criteria:

  • The recommended search query must not contain the text of the original query (that is, tensorflow).
  • The recommendation must not contain a query that has already been accepted (for example, CDMY9CDMY).
  • The recommendation must not contain more than one “vs”.
  • Once 5 suitable queries have been found, all the rest are ignored.

This is just one way to "clean up" the list of completion suggestions received from Google. I also sometimes find it useful to keep only single-word recommendations, but whether that helps depends on the specific situation.

Applying these criteria gives the following 5 results, each of which is assigned a weight.
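The criteria above can be sketched in Python roughly as follows. This is one possible reading of them, not the author's code: the function name filter_suggestions, the whole-word matching, and the choice to give the top-ranked result weight 5 down to weight 1 for the fifth are all assumptions.

```python
def contains_phrase(text, phrase):
    """True if `phrase` occurs in `text` as a run of whole words."""
    return f" {phrase} " in f" {text} "


def filter_suggestions(query, suggestions, accepted=(), limit=5):
    """Return up to `limit` (target, weight) pairs for one query.

    `suggestions` are raw strings such as 'tensorflow vs pytorch',
    in the order Google returned them.
    """
    query = query.lower().strip()
    seen = [item.lower() for item in accepted]
    kept = []
    for suggestion in suggestions:
        parts = [part.strip() for part in suggestion.lower().split(" vs ")]
        # Exactly one "vs", with the original query on the left-hand side.
        if len(parts) != 2 or parts[0] != query or not parts[1]:
            continue
        target = parts[1]
        # The target must not repeat the original query.
        if contains_phrase(target, query):
            continue
        # The target must not contain an already accepted query.
        if any(contains_phrase(target, previous) for previous in seen):
            continue
        seen.append(target)
        kept.append(target)
        if len(kept) == limit:
            break  # everything after the first `limit` matches is ignored
    # Assumed weighting: first kept result gets `limit`, the next limit - 1, ...
    return [(target, limit - rank) for rank, target in enumerate(kept)]
```

Matching on whole words rather than raw substrings avoids rejecting a target just because a short accepted query happens to be a fragment of a longer word.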

[Figure: the 5 results]

Next iteration


Next, these 5 recommendations go through the same processing as the initial query. Each one is passed to the API followed by “vs”, and again the 5 completion results that meet the criteria above are selected. Here is the result of processing the list above.

[Figure: completion results for the words already found]

You can continue this process by exploring the words in the CDMY10CDMY column that have not been explored yet.

If you run many iterations of this search, you get a fairly large table of queries and weights. This data is well suited to visualization as a graph.
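The iteration can be sketched as a breadth-first crawl. In this minimal version, suggest is a hypothetical callable that takes a query and returns its already filtered targets in rank order; the row layout (source, target, weight) and the max_queries cap are assumptions made for illustration.

```python
from collections import deque


def crawl(seed, suggest, max_queries=10, limit=5):
    """Explore queries breadth-first, starting from `seed`.

    Returns rows of (source, target, weight), where the top-ranked
    target of each query gets weight `limit` and the last gets 1.
    """
    rows = []
    explored = set()
    queue = deque([seed])
    while queue and len(explored) < max_queries:
        source = queue.popleft()
        if source in explored:
            continue
        explored.add(source)
        for rank, target in enumerate(suggest(source)[:limit]):
            rows.append((source, target, limit - rank))
            if target not in explored:
                queue.append(target)  # explore this word in a later round
    return rows
```

Passing the lookup in as a function keeps the crawl independent of the network, which makes it easy to test with canned data and to add caching or rate limiting later.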

Ego graphs


The network graph I showed at the beginning of the article is a so-called ego graph, built in our case for the query CDMY11CDMY. An ego graph contains all the nodes that lie within a given distance of the CDMY12CDMY node; this distance is the radius of the graph.

And how is the distance between nodes determined?

Let's look at the finished graph first.

[Figure: ego graph for the query tensorflow, radius 22]

We already know the weight of the edge connecting the queries CDMY13CDMY and CDMY14CDMY. It is the rank of the recommendation in the completion list, ranging from 1 to 5. To make the graph undirected, you can simply add up the weights of the links going in both directions between two vertices (that is, from CDMY15CDMY to CDMY16CDMY and, if such a link exists, from CDMY17CDMY to CDMY18CDMY). This gives edge weights ranging from 1 to 10.

The edge length (distance) is then calculated with the formula distance = 11 - weight. We chose the number 11 because the maximum edge weight is 10 (an edge has this weight when both recommendations appear at the very top of each other's completion lists). As a result, the minimum distance between queries is 1.
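A minimal sketch of this calculation, assuming the collected data is a list of (source, target, weight) rows; that row layout and the function name build_undirected_edges are assumptions. The output uses the same edge format as the networkx example later in the article.

```python
from collections import defaultdict


def build_undirected_edges(rows, max_weight=10):
    """Merge directed rows into undirected edges.

    The weights of A -> B and B -> A are added together, and
    distance = (max_weight + 1) - weight, i.e. 11 - weight by default.
    """
    totals = defaultdict(int)
    for source, target, weight in rows:
        key = tuple(sorted((source, target)))  # same key for both directions
        totals[key] += weight
    return [
        (a, b, {"weight": weight, "distance": max_weight + 1 - weight})
        for (a, b), weight in sorted(totals.items())
    ]


rows = [
    ("tensorflow", "pytorch", 5),  # pytorch tops the tensorflow list
    ("pytorch", "tensorflow", 5),  # and tensorflow tops the pytorch list
    ("tensorflow", "spark", 1),    # one-directional, lowest rank
]
edges = build_undirected_edges(rows)
```

Here the pytorch/tensorflow edge gets weight 10 and distance 1, while the spark/tensorflow edge gets weight 1 and distance 10.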

The size (size) and color (color) of a vertex are determined by the number of times (count) the corresponding query appears in the lists of recommendations. As a result, the larger the vertex, the more important the concept it represents.
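Counting those appearances is a one-liner with collections.Counter. The sketch below again assumes (source, target, weight) rows and a made-up helper name, build_nodes; it produces the node format used in the networkx example later in the article.

```python
from collections import Counter


def build_nodes(rows):
    """Count how often each query appears as a recommendation (target)."""
    counts = Counter(target for _source, target, _weight in rows)
    return [(name, {"count": count}) for name, count in sorted(counts.items())]


nodes = build_nodes([
    ("tensorflow", "pytorch", 5),
    ("keras", "pytorch", 4),
    ("pytorch", "tensorflow", 5),
])
```

A query that is never recommended by another query does not appear in this list, so a seed word may need to be added by hand.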

The ego graph in question has a radius of 22. This means that every query can be reached from the tensorflow vertex by traveling a distance of no more than 22. Let's see what happens if we increase the radius to 50.
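To make the meaning of the radius concrete, here is a small pure-Python sketch that finds every vertex reachable within a given total distance, using Dijkstra's shortest-path algorithm. The edge list is invented for illustration, and nodes_within_radius is a hypothetical helper, not part of any library.

```python
import heapq


def nodes_within_radius(edges, center, radius):
    """Return {node: shortest distance} for nodes within `radius` of `center`.

    `edges` is a list of undirected (a, b, distance) triples.
    """
    adjacency = {}
    for a, b, distance in edges:
        adjacency.setdefault(a, []).append((b, distance))
        adjacency.setdefault(b, []).append((a, distance))

    best = {center: 0}
    heap = [(0, center)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in adjacency.get(node, []):
            new_dist = dist + length
            if new_dist <= radius and new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return best


# Invented distances, for illustration only.
example_edges = [
    ("tensorflow", "pytorch", 1),
    ("pytorch", "caffe", 6),
    ("tensorflow", "spark", 10),
    ("spark", "hadoop", 9),
    ("hadoop", "hive", 5),
]
small = nodes_within_radius(example_edges, "tensorflow", 22)
large = nodes_within_radius(example_edges, "tensorflow", 50)
```

With a radius of 22, hive is left out because its shortest path from tensorflow has length 24; with a radius of 50 it is included.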

[Figure: ego graph for the query tensorflow, radius 50]

The result is interesting! This graph contains most of the core technologies that anyone working in artificial intelligence should know about, and the technology names are grouped logically.

And all this is built on the basis of one single keyword.

How to draw such graphs?


I used the Flourish tool to draw such a graph.

This service allows you to build network diagrams and other diagrams using a simple interface. I suppose it's worth taking a look at for those who are interested in building ego graphs.

How to create an ego graph with a given radius?


To create an ego graph with a given radius, you can use the networkx Python package. It has a very convenient function, ego_graph. The radius of the graph is specified when the function is called.

    import networkx as nx

    # Input data format (nodes and edges must be defined before the graph is built):
    # nodes = [('tensorflow', {'count': 13}),
    #          ('pytorch', {'count': 6}),
    #          ('keras', {'count': 6}),
    #          ('scikit', {'count': 2}),
    #          ('opencv', {'count': 5}),
    #          ('spark', {'count': 13}), ...]
    # edges = [('pytorch', 'tensorflow', {'weight': 10, 'distance': 1}),
    #          ('keras', 'tensorflow', {'weight': 9, 'distance': 2}),
    #          ('scikit', 'tensorflow', {'weight': 8, 'distance': 3}),
    #          ('opencv', 'tensorflow', {'weight': 7, 'distance': 4}),
    #          ('spark', 'tensorflow', {'weight': 1, 'distance': 10}), ...]

    # Build the full source graph
    G = nx.Graph()
    G.add_nodes_from(nodes)
    G.add_edges_from(edges)

    # Build the ego graph for 'tensorflow'
    EG = nx.ego_graph(G, 'tensorflow', distance='distance', radius=22)

    # Find the 3-edge-connected subgraphs
    subgraphs = nx.algorithms.connectivity.edge_kcomponents.k_edge_subgraphs(EG, k=3)

    # Get the subgraph that contains 'tensorflow'
    for s in subgraphs:
        if 'tensorflow' in s:
            break
    pruned_EG = EG.subgraph(s)

    ego_nodes = pruned_EG.nodes()
    ego_edges = pruned_EG.edges()

In addition, I used another function here: k_edge_subgraphs. It is used to remove results that do not suit our needs.

For example, storm is an open-source framework for real-time distributed computing. But it is also a character from the Marvel universe. Which search suggestions will “win” if you type “storm vs” into Google?

The k_edge_subgraphs function finds groups of vertices that cannot be disconnected by removing fewer than k edges. As it turned out, the parameters CDMY27CDMY and CDMY28CDMY work well here. In the end, only the subgraph that contains tensorflow is kept. This ensures that we do not stray too far from where the search started.
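The effect of this pruning is easy to see on a toy graph. The edges and distances below are invented, and k=2 is used only to keep the example small; it is not the setting recommended in the article.

```python
import networkx as nx

# Toy data: a tightly linked machine-learning cluster plus a chain of
# comic-book characters hanging off "storm". Distances are made up.
G = nx.Graph()
G.add_edges_from([
    ("tensorflow", "pytorch", {"distance": 1}),
    ("tensorflow", "keras", {"distance": 2}),
    ("pytorch", "keras", {"distance": 5}),
    ("tensorflow", "storm", {"distance": 10}),
    ("storm", "thor", {"distance": 10}),
    ("thor", "hulk", {"distance": 10}),
])

# Keep only nodes within a total distance of 22 from 'tensorflow'.
EG = nx.ego_graph(G, "tensorflow", distance="distance", radius=22)

# Keep the 2-edge-connected group that contains 'tensorflow'.
group = next(
    nodes for nodes in nx.k_edge_subgraphs(EG, k=2) if "tensorflow" in nodes
)
pruned_EG = EG.subgraph(group)
```

The radius drops hulk (distance 30), and the connectivity step then drops storm and thor, because each is attached to the main cluster through a single edge. What remains is the tensorflow, pytorch and keras triangle.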

Using ego graphs in life


Let's move away from the tensorflow example and look at another ego graph, this time about something else that interests me: the chess opening known as the Ruy Lopez (also called the Spanish Game).

▍Study of Chess Openings


[Figure: exploring the Ruy Lopez (query: ruy lopez)]

Our technique allowed us to quickly discover the most common opening ideas, which can help a chess researcher.

Now let's look at other examples of using ego graphs.

▍Healthy foods


Kale! Yummy!

But what if you wanted to replace beautiful, incomparable kale with something else? The ego graph built around kale will help you with this.

[Figure: ego graph for the query kale, radius 25]

▍Buying a dog


There are so many dogs, and so little time... I need a dog. But which one? Maybe something like a poodle?

[Figure: ego graph for the query poodle, radius 18]

▍Looking for love


A dog and kale haven't changed anything? Need to find your soulmate? If so, here is a small but entirely self-contained ego graph that can help.

[Figure: ego graph for the query coffee meets bagel, radius 18]

▍What should I do if the dating apps didn’t help?


If dating apps are useless, then instead of hanging around in them you should watch a TV series, stocked up on kale-flavored ice cream (or ice cream flavored with your recently discovered arugula). If you like The Office (the UK version, of course), you might like some other shows too.

[Figure: ego graph for the query the office, radius 25]

Summary


This concludes the story of the “vs” trick in Google search and of ego graphs. I hope all of this helps you, at least a little, to find love, a good dog, and healthy food.

Do you use any unusual tricks when searching the Internet?
