Software Engineering
This report assesses the information available online on the topic of the master's thesis. It serves as the main documentary evidence of the depth and completeness of the information search, and it records the current state of the field under study.
The search was performed with four search engines (Google, Yandex, Bing, Meta), and the results are summarized in the tables below. In total, 20 queries related to the master's thesis were run: four queries with the thesis title in four languages, four queries with the supervisor's name, and twelve queries with key concepts of the thesis topic.
Below are two tables of search results taken three months apart, followed by a series of charts that show the major changes over that period.
Search string | Google | Yandex | Bing | Meta |
Russian | ||||
Исследование методов классификации информации о внешнеторговой деятельности государств в рамках информационно-поисковой системы | 83000 | 167000000 | 30 | 39900 |
Коломойцева Ирина Александровна, ДонНТУ | 28700 | 6000 | 10 | 16000 |
Классификация текстов | 11300000 | 50000000 | 20000 | 5550000 |
Алгоритмы классификации текстов | 2430000 | 76000000 | 8060 | 1117000 |
Классификация внешнеторговой информации | 345000 | 50000000 | 7750 | 169000 |
Ukrainian | ||||
Дослідження методів класифікації інформації про зовнішньоторговельну діяльність держав в рамках інформаційно-пошукової системи | 9100 | 208000000 | 0 | 3550 |
Коломойцева Ірина Олександрівна, ДонНТУ | 2900 | 67000000 | 0 | 21 |
Класифікація текстів | 3340000 | 52000000 | 12100 | 1710000 |
Алгоритми класифікації текстів | 1820000 | 63600000 | 4300 | 928500 |
Класифікація зовнішньоторговельної інформації | 132000 | 26100000 | 2850 | 65000 |
English | ||||
Research of information classifying methods on international trade activity of states within the framework of an information retrieval system | 13200000 | 228000000 | 6490 | 6630000 |
Kolomoitseva Irina Aleksandrovna, DonNTU | 31000 | 10000 | 0 | 14600 |
Text classification | 343800000 | 81000000 | 7330000 | 150500000 |
Text classification algorithms | 63700000 | 86000000 | 1420000 | 89750000 |
Classification of international trade information | 763600000 | 79000000 | 3800000 | 373000000 |
Spanish | ||||
Investigación de métodos de clasificación de información sobre la actividad internacional comercial de los estados en el marco de un sistema de recuperación de información | 9020000 | 45000000 | 2300 | 4250000 |
Kolomoitseva Irina Aleksandrovna, DonNTU | 31000 | 10000 | 0 | 14800 |
Clasificación de texto | 143800000 | 5800000 | 2380000 | 71800000 |
Algoritmos de clasificación de texto | 3540000 | 6000000 | 51000 | 1720000 |
Clasificación de la información del comercio internacional | 39700000 | 22000000 | 427000 | 19500000 |
Search string | Google | Yandex | Bing | Meta |
Russian | ||||
Исследование методов классификации информации о внешнеторговой деятельности государств в рамках информационно-поисковой системы | 83500 | 188000000 | 26 | 40300 |
Коломойцева Ирина Александровна, ДонНТУ | 29000 | 11000 | 8 | 16200 |
Классификация текстов | 11400000 | 66000000 | 25400 | 5600000 |
Алгоритмы классификации текстов | 2450000 | 77000000 | 12500 | 1190000 |
Классификация внешнеторговой информации | 345000 | 36000000 | 12400 | 171000 |
Ukrainian | ||||
Дослідження методів класифікації інформації про зовнішньоторговельну діяльність держав в рамках інформаційно-пошукової системи | 9160 | 197000000 | 1 | 3560 |
Коломойцева Ірина Олександрівна, ДонНТУ | 2900 | 67000000 | 3 | 18 |
Класифікація текстів | 3360000 | 56000000 | 17200 | 1730000 |
Алгоритми класифікації текстів | 1830000 | 67000000 | 4460 | 938000 |
Класифікація зовнішньоторговельної інформації | 132000 | 29000000 | 2970 | 65600 |
English | ||||
Research of information classifying methods on international trade activity of states within the framework of an information retrieval system | 13300000 | 225000000 | 6540 | 6700000 |
Kolomoitseva Irina Aleksandrovna, DonNTU | 31200 | 17000 | 0 | 14800 |
Text classification | 317000000 | 81000000 | 9550000 | 152000000 |
Text classification algorithms | 64200000 | 86000000 | 2270000 | 90700000 |
Classification of international trade information | 744200000 | 79000000 | 5900000 | 377000000 |
Spanish | ||||
Investigación de métodos de clasificación de información sobre la actividad internacional comercial de los estados en el marco de un sistema de recuperación de información | 9080000 | 50000000 | 2330 | 4280000 |
Kolomoitseva Irina Aleksandrovna, DonNTU | 31200 | 17000 | 0 | 15000 |
Clasificación de texto | 144000000 | 6000000 | 3500000 | 71800000 |
Algoritmos de clasificación de texto | 3540000 | 7000000 | 58900 | 1740000 |
Clasificación de la información del comercio internacional | 39600000 | 19000000 | 620000 | 19700000 |
Comparing the results across search engines, it is difficult to identify a single leader, although Yandex and Google stand out against the others.
The number of pages found correlates with the alphabet of the search string. For Cyrillic queries Yandex returns far more results, while for queries in Latin script Google does. On these data Bing is a clear outsider, but the relevance of the found pages was not evaluated, so it would be premature to speak of shortcomings of that system. The Meta search engine shows fairly good results, but it should be kept in mind that it is built on top of Google search.
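The comparison by alphabet can be reproduced with a short script. The following is a minimal sketch, not the method used for the report: it assumes the table columns follow the order in which the engines are listed in the text (Google, Yandex, Bing, Meta), and the names `ENGINES`, `ROWS`, `script_of`, and `leader` are hypothetical helpers introduced only for illustration. The counts are copied from four rows of the first table.

```python
import unicodedata

# Assumed column order: the order in which the report lists the engines.
ENGINES = ("Google", "Yandex", "Bing", "Meta")

# A few rows copied from the first table: query -> result counts per engine.
ROWS = {
    "Классификация текстов": (11300000, 50000000, 20000, 5550000),
    "Класифікація текстів": (3340000, 52000000, 12100, 1710000),
    "Text classification": (343800000, 81000000, 7330000, 150500000),
    "Clasificación de texto": (143800000, 5800000, 2380000, 71800000),
}


def script_of(query: str) -> str:
    """Classify a query by the script of its first letter."""
    for ch in query:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("CYRILLIC"):
                return "Cyrillic"
            if name.startswith("LATIN"):
                return "Latin"
            return "other"
    return "other"


def leader(counts) -> str:
    """Name of the engine with the largest count in one table row."""
    return max(zip(ENGINES, counts), key=lambda pair: pair[1])[0]


leaders = {query: leader(counts) for query, counts in ROWS.items()}

for query, engine in leaders.items():
    print(f"{script_of(query):8s} {engine:7s} {query}")
```

For these four rows the two Cyrillic queries are led by Yandex and the two Latin-script queries by Google, which agrees with the observation above. Raw counts say nothing about relevance, so the script only illustrates the volume comparison.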
The change over time in the number of results for each query in each search engine is shown in the diagram below.
As the diagram shows, for the vast majority of queries the number of found pages increased over time. For some queries the number stayed almost the same, which is most noticeable for the Ukrainian and Spanish queries.
It should also be noted that for some queries the number of found pages decreased, in some cases quite sharply. This suggests that the search engines are improving their search algorithms, filtering out non-unique articles, or revising their index files.
According to the diagram, two systems are changing fastest in terms of the number of found pages: Yandex and Bing. They showed the largest increases over time, 83% and 60% respectively, as well as the largest decreases, 28% and 20% respectively. These data indicate that both systems are actively working on their index files and search relevance.
By language, the undisputed leaders are Russian and English as languages of international scientific communication, but the Spanish-language segment is currently growing very quickly. Ukrainian does not produce comparable results in this set; however, like Japanese or Swedish, it is a language of a single national culture, so comparing it with languages of international communication cannot give a reliable picture of how this language segment is developing.
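The percentages are relative changes between the two tables. The sketch below shows the calculation on three rows whose values appear to match the reported figures. The report does not say which rows its numbers come from, so the choice of rows is an assumption, as is the column order (Google, Yandex, Bing, Meta). The helper `percent_change` is a hypothetical name used only for illustration.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        # Several Bing cells in the first table are 0; a relative
        # change from a zero baseline is undefined.
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# (engine, query, count in first table, count in second table),
# copied from the two tables above; row selection is an assumption.
samples = [
    ("Yandex", "Коломойцева Ирина Александровна, ДонНТУ", 6000, 11000),
    ("Bing", "Text classification algorithms", 1420000, 2270000),
    ("Bing", "Коломойцева Ирина Александровна, ДонНТУ", 10, 8),
]

for engine, query, before, after in samples:
    print(f"{engine:7s} {percent_change(before, after):+6.1f}%  {query}")
```

Rounded to whole percent, these rows give +83%, +60%, and -20%. Note that the last figure rests on a change from 10 to 8 pages, so percentages computed from very small counts should be read with caution.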